Abstract


In randomized distributed computing, executions encounter branch points resolved 
either randomly or non-deterministically. Random decisions and non-deterministic 
choices interact and affect each other in subtle ways. This thesis is devoted to 
the analysis and illustration of the effects of the interplay between randomness and 
non-determinism in randomized computing. 


Using ideas from game theory, we provide a general model for randomized comput- 
ing which formalizes the mutual effects of randomization and non-determinism. An 
advantage of this model over previous models is that it is particularly effective for 
expressing mathematical proofs of correctness in two difficult domains in random- 
ized computing. The first domain is the analysis of randomized algorithms where 
non-deterministic choices are made based on a limited knowledge of the execution 
history. The second domain concerns the establishment of lower bounds and proofs
of optimality.


The advantages of this model are illustrated in the context of three problems. First,
we consider the classical randomized algorithm for mutual exclusion [49] of Ra- 
bin. This algorithm illustrates perfectly the difficulties encountered when the non- 
deterministic choices are resolved based on a limited knowledge of execution history. 


We then analyze the Lehmann-Rabin Dining Philosophers algorithm (1981). Our 
analysis provides a general method for deriving probabilistic time bounds for ran- 
domized executions. 


In the last part, we analyze a scheduling problem and give solutions in both the 
deterministic and the randomized cases. Lower bounds arguments show these solu- 
tions to be optimal. For the randomized case, we take full advantage of the game 
theoretic interpretation of our general model. In particular, the proof of optimality 
reflects von Neumann’s duality for matrix games.
Chapter 1


Introduction 


For many distributed problems, it is possible to produce randomized algorithms 
that are better than their deterministic counterparts: they may be more efficient, 
have simpler structure, and even achieve correctness properties that deterministic 
algorithms cannot. One problem with using randomization is the increased difficulty 
in analyzing the resulting algorithms. This thesis is concerned with this issue and 
provides formal methods and examples for the analysis of randomized algorithms, 
the proof of their correctness, the evaluation of their performance and, in some
instances, the proof of their optimality.


By definition a randomized algorithm is one whose code can contain random choices, 
which lead to probabilistic branch points in the tree of executions. In order to 
perform these random choices the algorithm is provided at certain points of the 
execution with random inputs having known distributions: these random inputs are 
often called coin tosses and we accordingly say that the algorithm flips a coin to 
make a choice. 


A major difficulty in the analysis of a randomized algorithm is that the code of 
the algorithm and the value of the random inputs do not always completely char- 
acterize the execution: the execution sometimes branches according to some non- 
deterministic choices which are not in the control of the algorithm. Typical examples 
of such choices are the choices of the inputs of the algorithm (in which case the cor- 
responding branch point is at the very beginning of the execution), the scheduling 
of the processes (in a distributed environment), the control of the faults (in a faulty 
environment) and the changes in topology (in a dynamic environment). For the 
sake of modeling we call the entity controlling these choices the adversary. In general,
the adversary may also have access to random sources to make these choices. (In
this case the adversary decides non-deterministically the probability distribution of
the coin.)


A randomized algorithm therefore typically involves two different types of
nondeterminism: that arising from the random choices, whose probability distributions are
specified in the code, and that arising from an adversary, which by definition resolves all
the choices for which no explicit randomized mechanism of decision is provided in
the code.


The interaction between these two kinds of nondeterminism complicates significantly 
the analysis of randomized algorithms and is at the core of many mistakes. To un- 
derstand the issues at stake consider a typical execution of a randomized algorithm. 
The execution runs as prescribed by the code of the algorithm until a decision not 
in the control of the code has to be resolved. (For instance, in a distributed context, 
the adversary decides, among other things, the order in which processes take steps.) 
This part of the execution typically involves random choices and therefore, the state 
reached by the system when a decision of the adversary is required is also random. 
Generally, the decision of the adversary depends on the random value of the state,
$s_1$, reached. Its decision, $a_1$, in turn, characterizes the probabilistic way the
execution proceeds in its second part: the branch of the code then followed is specified by
$(s_1, a_1)$. The execution proceeds along that branch, branching randomly as specified
by the code until a second non-deterministic branch point must be resolved. The
adversary then makes a decision. This decision generally depends on the random
value of the state, $s_2$, reached, and, in turn, characterizes the probabilistic way the
execution proceeds in its third part ... The execution thus proceeds, in a way where
the random inputs used by the algorithm influence the decisions of the adversary; 
decisions which in turn determine the probability distributions of the random inputs 
used by the algorithm subsequently. 
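
The alternation described above can be made concrete on a toy example. The following Python sketch is only an illustration under stated assumptions: the two-coin algorithm, the action names "x" and "y", the coin biases and all function names are hypothetical and do not come from this thesis. For a fixed adversary it enumerates exactly the executions (s1, a1, s2) and their probabilities.

```python
from fractions import Fraction

# Hypothetical toy algorithm: the action chosen by the adversary selects
# which coin (probability of outcome 1) the code flips next.
COIN_AFTER_ACTION = {"x": Fraction(1, 2), "y": Fraction(1, 4)}


def execution_distribution(adversary):
    """Return {execution: probability} for one fixed adversary.

    An execution is a triple (s1, a1, s2): first random state,
    adversary decision, second random state.
    """
    dist = {}
    for s1 in (0, 1):                     # the first coin is fair
        a1 = adversary(s1)                # the decision may depend on s1
        p_one = COIN_AFTER_ACTION[a1]     # the decision fixes the next coin
        for s2, p2 in ((1, p_one), (0, 1 - p_one)):
            dist[(s1, a1, s2)] = Fraction(1, 2) * p2
    return dist


def prob(dist, event):
    """Probability of an event, given as a predicate on executions."""
    return sum(p for execution, p in dist.items() if event(execution))


# Three hypothetical adversaries.
always_x = lambda s1: "x"
always_y = lambda s1: "y"
adaptive = lambda s1: "y" if s1 == 1 else "x"

second_is_one = lambda execution: execution[2] == 1

for name, adv in (("always_x", always_x), ("always_y", always_y), ("adaptive", adaptive)):
    print(name, prob(execution_distribution(adv), second_is_one))
```

In this toy setting the same event receives probability 1/2, 1/4 or 3/8 depending on the adversary, and the set of possible executions itself changes with the adversary.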


The analysis and the measure of the performance of randomized algorithms is usually 
performed in the worst case setting. An algorithm “performs well” if it does so 
against all adversaries. For example, among the correctness properties one often 
wishes to prove for randomized algorithms are properties that state that a certain 
property of executions has a “high” probability of holding against all adversaries; or 
that a certain random variable depending on the executions, (e.g., a running time), 
has a “small” expected value for all adversaries. 


The proof of such properties often entails significant difficulties, the first of which is to
make formal sense of the property claimed. Such statements often implicitly assume 
the existence of a probability space whose sample space is the set of executions, and 
whose probability distribution is induced by the distribution of the random inputs 
used by the algorithm. But what is “this” probability space? As we saw, the 
random inputs used during an execution depend on the decisions made previously
in the execution by the adversary. This shows that we do not have one, but instead 
a family of probability spaces, one for each adversary. For each fixed adversary, 
the coins used by the algorithm are well-defined and characterize the probabilistic 
nature of the executions. 


The analysis of a given randomized algorithm $\pi$ therefore requires one to model
the set of adversaries to be considered with $\pi$: we refer to them as the set of
admissible adversaries. We must then construct (if possible) the probability space
$(\Omega_A, \mathcal{G}_A, P_A)$ corresponding to each adversary A. It should be noted that the choice
A of the adversary not only affects the distribution $P_A$, if it exists, induced by the
random inputs on the set of executions, but also the set of executions itself. This
remark justifies that the analysis of a randomized algorithm requires one to consider
a different sample space $\Omega_A$ and a different $\sigma$-field $\mathcal{G}_A$ for every adversary A.
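
With one probability space per adversary, the worst-case statements discussed above can be written explicitly. The following is a sketch in our own notation: the event $W$, the random variable $T$ and the thresholds $\varepsilon$ and $\tau$ are placeholders introduced here for illustration, and $\mathcal{A}$ denotes the set of admissible adversaries.

```latex
% "W holds with high probability against all adversaries"
\inf_{A \in \mathcal{A}} P_A[W] \;\ge\; 1 - \varepsilon ,
\qquad
% "T has a small expected value for all adversaries"
\sup_{A \in \mathcal{A}} E_A[T] \;\le\; \tau .
```

Each term is evaluated in its own space $(\Omega_A, \mathcal{G}_A, P_A)$, so $W$ and $T$ must be meaningful for every admissible adversary A.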


Most authors, including the pioneers in the area of randomized computing, are aware 
that the adversary influences the probability distribution on the set of executions. 
For instance, in an early paper [37] Lehmann and Rabin define a schedule¹ A to be


“a function which assigns to every past behavior of the n processes the 
process whose turn is next to be active... Under past behavior we mean 
the complete sequence of atomic actions and random draws with their 
results, up to that time .... This captures the idea that, for any spe- 
cific system, what will happen next depends on the whole history of past 
successes and failures of the processes ... as well as on what happened
internally within the processes.”


To each such A, the same paper [37] associates a probability distribution on the set 
of executions. We quote: 


“For a given schedule A and specific outcomes of the random draws $\iota$,
we get a particular computation $\omega = \mathrm{COM}(A, \iota)$ .... On the space of all
possible outcomes of random draws $\iota$ we impose the uniform distribution.
The function COM then associates with every schedule A a probability
distribution on the set of all computation, the probability of a set E of
computations being defined as the probability of the set of sequences of
random draws $\iota$ such that $\mathrm{COM}(A, \iota)$ is in E.”


¹ In [37] a schedule corresponds to what we call an adversary.
² [37] uses the notation S in place of A, D in place of $\iota$ and C in place of $\omega$. We use A, $\iota$ and $\omega$
to be consistent with the rest of our discussion.
This approach presents well the direction to be followed in a formal analysis. Nevertheless
many of the models and analyses published so far suffer from various
limitations, incompleteness and sometimes mistakes that we summarize now. We 
use the fact that most of the existing work on randomized distributed computing 
can be classified into one of two classes. 


To begin, there is on the one hand a very rich body of work analyzing the semantics 
and logics of randomized computing (cf. [2, 18, 19, 23, 28, 29, 32, 38, 47, 51, 55, 58]). 
The emphasis in most of these papers is to provide a unified semantics or model of 
computation for the description of randomized algorithms; and to recognize some 
proof rules, methods and tools allowing the automatic verification of some specific 
subclasses of algorithms. The results thus obtained are typically limited in the 
following three ways. A major limitation is that very little in general can be said 
for properties of randomized algorithms that do not hold with probability one: such 
properties are usually too specific to hold in general situations, and also usually too 
hard to be simple consequences of general ergodic theory. For instance, the typical 
problem considered by Vardi in [58] is the “probabilistic universality problem”, 
where one checks whether a formula belonging to some temporal logic holds with 
probability one. (This problem can be reformulated in an equivalent way using 
$\omega$-automata.) To quote [58], these methods


“deal only with qualitative correctness. One has often quantitative cor- 
rectness conditions such as bounded waiting time [49] or real-time re- 
sponses [52]. Verifications of these conditions requires totally different 
techniques.” 


A second limitation of these methods is that they are directed at randomized al- 
gorithms whose correctness reflects only the asymptotic properties of infinite ex- 
ecutions. A translation of this fact is their relative success in expressing liveness 
properties and their failure in expressing the short-term behavior and correctness of 
the studied algorithms. A third limitation is that, in their generality these methods 
are able to take into account only very marginally the exact numeric distributions 
used for the random inputs. These considerations lead the authors of [38] to say 
that, in their model, 


“only very basic facts about probability theory are required to prove the
properties needed. Essentially one does not need anything more than:
‘if I throw a coin an infinite number of times then it will fall an infinite
number of times on heads.’”
A second big body of work in distributed randomized computing is devoted to the
design and analysis of specific randomized algorithms. 


These algorithms typically introduce randomness into algorithms for synchroniza- 
tion, communication and coordination between concurrent processes (cf. [3, 37, 49, 
36, 48, 50, 14, 21, 25]). If correct (!), some of these algorithms solve problems that have been
proven unsolvable by deterministic algorithms [3, 37, 21, 14]. Others improve on 
deterministic algorithms by various measures [49, 36, 50]. The analysis and proof of 
correctness of these algorithms is complex as is attested by the fact that subsequent 
papers were published providing new proofs of the same claims (cf. [29, 47]). 


In spite of successive improvements most of the analyses presented in these papers
suffer from the fact that the proofs and often even the statements are not expressed
formally using the probability distributions $P_A$, and that the proofs do not track
precisely how the adversaries can influence the probabilities. The paper by Aspnes and
Herlihy [3] is one of the few presenting formally the correctness statement claimed.
Nevertheless no explicit use of the measures $P_A$ is made during the proof. The
claims and arguments in the papers [14, 37] and [49] similarly lack a truly formal
treatment. As a consequence of this lack of formalism, many proofs are effectively
unreliable for readers who are willing neither to rely solely on claims nor to spend
the time to reach an intuitive understanding of the algorithm.


To summarize, the limitations encountered in the papers of the first class stem 
from the desire to develop automatic methods of verification for a wide class of 
randomized algorithms: the wider the class, the less intricate each algorithm can 
be. In contrast, from our point of view, the question of generating the proofs of 
correctness of randomized algorithms can be left to the ingenuity of each prover. 
Instead, our ambition is to derive a probabilistic framework within which most if not 
all mathematical analyses and proofs related to specific randomized algorithms can 
be expressed. This requires two types of construction. First, we need to develop 
a general model defining algorithms and adversaries and formalizing their interac- 
tions. (We call this a general model for randomized computing.) Then, for each 
algorithm/adversary structure (described within the general model for randomized 
computing), we need to characterize the probability spaces used in the analysis. 


To understand better the nature of the work required we describe two situations 
that a general model for randomized computing ought to address. We begin with a 
situation very rarely addressed in the literature but whose consideration places the 
problems encountered in the right perspective. 
Lower bounds, game theory. Most of the work existing on the analysis of distributed
randomized algorithms and quoted above either provides an analysis of one
existing algorithm or provides some tools geared at providing such an analysis. All 
these results can be called upper-bound results: they show that a given problem can 
be solved at a given level of performance by providing an algorithm having that level 
of performance. On the other hand, very little work has been done on establishing
exact lower bounds for a randomized distributed problem and on proving the
optimality of a randomized solution to this problem. The lack of formal development
in this direction is mostly a reflection of the complications involved in randomized
lower bounds and of the dearth of published work in this area.³ One exception is
provided by the combined work of Graham, Yao and Karlin [25, 33] who provide 
precise lower bounds and a proof of optimality for the randomized (3,1)-Byzantine 
Generals Problem: Byzantine broadcast with 3 processes, one of which is faulty. 
Also, in [35], Kushilevitz et al. present some asymptotic lower bounds on the size 
of the shared random variable for randomized algorithms for mutual exclusion. 


A general model for randomized computing ought to provide a solid framework 
allowing also the formal analysis of such lower bounds. In contrast with upper 
bound proofs, the proofs of lower bounds require a model allowing the analyses 
and comparison of a family of algorithms. These algorithms must be evaluated in 
relation with a family of adversaries. 


As mentioned in [3], page 443, [29], page 367, [46], page 145, the situation is actually 
best understood in the language of game theory. The “unified measure of complexity 
in probabilistic computations” proposed in 1977 in [61] by Yao also adopts (implic- 
itly) this point of view and relies fundamentally on Von Neumann’s theorem in game 
theory. We will adopt explicitly this point of view in the sequel and let Player(1) 
be the entity selecting the algorithm and Player(2) be the entity selecting the ad- 
versary. In this language, algorithms and adversaries are strategies of respectively 
Player(1) and Player(2). 


If Player(1) selects the algorithm $\pi$ and if Player(2) selects the adversary A, the
game played by the two players consists of the alternating actions of the algorithm
and the adversary: Player(1) takes all the actions as described by $\pi$ until the first


³ The extreme difficulty of proving lower bounds for randomized algorithms is reflected in the
fact that [33] is one of the most referenced non-published papers! 

Note also that lower bounds for on-line algorithms (cf., for instance, [7, 15]) are of a very different 
type. These algorithms are not evaluated by their absolute performance but instead by some relative 
measure, usually the ratio of their performance and the performance of the best off-line algorithm. 
In such problems the coupling between the adversary and the algorithm is often easily analyzable, 
using competitive analysis. By contrast, the coupling between adversary and algorithm is much 
harder to analyze in general situations where absolute measures of performance are used. 
point where some choice has to be resolved by the adversary; Player(2) then takes
actions to resolve this choice as described by A and Player(1) resumes action once 
the choice has been resolved .... In this game, the two players have conflicting 
objectives: Player(1) selects its strategy so as to heighten the performance studied 
whereas Player(2) selects its own strategy so as to lower it. The performance in 
question depends on the problem studied. 
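
To illustrate the conflicting objectives, consider a finite toy version of this game. The sketch below is an illustration under stated assumptions, not the model of this thesis: the payoff matrix M is hypothetical, its rows stand for strategies of Player(1), its columns for strategies of Player(2), and each entry is the performance obtained. It compares what each player can guarantee with deterministic strategies to the value obtained when both may randomize, using the standard closed form for 2x2 zero-sum games.

```python
from fractions import Fraction


def pure_maximin(m):
    """Best performance Player(1) can guarantee with a deterministic choice."""
    return max(min(row) for row in m)


def pure_minimax(m):
    """Lowest cap Player(2) can enforce with a deterministic choice."""
    return min(max(col) for col in zip(*m))


def mixed_value_2x2(m):
    """Value of a 2x2 zero-sum game in which the row player maximizes."""
    lo, hi = pure_maximin(m), pure_minimax(m)
    if lo == hi:                      # saddle point in pure strategies
        return Fraction(lo)
    (a, b), (c, d) = m
    # Standard closed form when no pure saddle point exists.
    return Fraction(a * d - b * c, a + d - b - c)


# Hypothetical performance matrix (rows: Player(1), columns: Player(2)).
M = [[1, 0],
     [0, 1]]

print(pure_maximin(M), pure_minimax(M), mixed_value_2x2(M))
```

With pure strategies Player(1) can guarantee only 0 while Player(2) can only cap the performance at 1. Once both players may randomize, the two guarantees meet at 1/2, as von Neumann's minimax theorem predicts.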


For instance, in [49] the performance expresses a measure of fairness among par- 
ticipating processes and Player(2) “tries” to be unfair to some process. In [37], 
the performance expresses that “progress occurs fast” and Player(2) tries to slow 
the occurrence of progress. In [3], the performance is measured by the expected 
time to consensus and Player(2) chooses inputs and schedules so as to increase that 
expected time. Also, let us mention that adopting a game point of view provides a 
unifying description of the complex situation studied in [25] by Graham and Yao. 
This paper studies the resiliency of a certain class of algorithms under the failure 
of a single process. There, Player(2) fulfills several functions. It receives
some information about the algorithm selected, selects both the initial input
and the identity of the faulty process, and then monitors the messages sent by the 
faulty process in the course of the execution. 


To finish let us remark that considering the relation between randomized algorithms 
and adversaries as a game between Player(1) and Player(2) is also very useful when 
a fixed algorithm $\pi$ is analyzed.⁴ This situation simply corresponds to the case
where Player(1) has by assumption a single strategy $\pi$.


Formalizing the notion of knowledge. It should be intuitively clear that an 
optimal adversary is one that will optimize at each of its steps the knowledge it holds 
so as to take decisions most detrimental to the performance sought by Player(1). 
Similarly, in cases where Player(1) has more than one strategy $\pi$ to choose from
(i.e., in cases where more than one algorithm can be considered), the best strategy
is one that takes best advantage of the knowledge available about the past moves of 
Player(2) and about the strategy implemented by Player(2). 


Establishing the performance of an algorithm is always tantamount to proving that 
“the optimal” adversary cannot reduce the performance below the level of perfor- 
mance claimed for that algorithm. Equivalently, a proof of correctness must in some 
way establish a bound on the usefulness of the knowledge available to Player(2). 


This is why a general probabilistic model for randomized computing should
formalize the notion of knowledge available to the players, and why it should


⁴ This was actually the original insight of [3, 29, 46].
provide an explicit mechanism of communication between the players, formalizing
the update of their knowledge as the execution progresses. 


To illustrate this point of view note that in all the formal constructions quoted above 
and associated with the analysis of a specific algorithm (see for instance [3, 28, 29, 
40, 47, 58]) the model proposed for an adversary formalizes that Player(2) makes 
its choices knowing the complete past execution: this was indeed the idea of Rabin 
in [37] quoted above. Nevertheless the correctness of some published algorithms 
depends critically on the fact that Player(2) is allowed only a partial knowledge of 
the state of the system and/or has only a finite memory of the past. The example 
of Rabin’s randomized algorithm for mutual exclusion [49] is very interesting in this 
respect. ([49] is one of the few papers venturing into that area.) Lacking a formal
model, Rabin resorts to intuition to present and establish the main correctness
property. We quote: 


“In the context of our study we do not say how the schedule arises, or
whether there is a mechanism that imposes it .... We could have an
adversary scheduler who tries to bring about a deadlock or a lockout
of some process $P_i$. A special case of this evil scheduler is that some
processes try to cooperate in locking out another process. Our protocol
is sufficiently robust to have the desired property of fairness for every
schedule S.

We have sufficiently explained the notion of protocol to make it unnecessary
to give a formal definition: given a protocol $\pi$ we now have a natural
notion of a run $\alpha = i_1 X_1 i_2 X_2 \ldots$, resulting from computing according to
$\pi$. Again we do not spell out the rather straightforward definition. Note
that since process $i$ may flip coins, even for a fixed schedule S there may
be many different runs $\alpha$ resulting from computing according to $\pi$.”


As mentioned previously, the notion of schedule used in [49] corresponds to our 
notion of adversary. Also, the notion of run used in [49] corresponds to what we 
call the knowledge held by Player(2). As demonstrated in Chapter 3 of this thesis, 
a rigorous proof of the correctness of the algorithm presented in [49] should have 
actually required a formal model stating clearly “how the schedule arises” and also 
formalizing the interaction between Player(1) and Player(2): after having formalized
the setting of [49], we establish that the knowledge at the disposal of Player(2)
is much stronger than what was originally believed, and that the algorithm is not
“sufficiently robust to have the desired property of fairness for every schedule S” as
claimed in [49].


The reason for the failure of the proof given in [49] is that it does not take into 
account the knowledge available to Player(2). Instead it argues solely in terms
of the distribution of the random inputs; i.e., attempts to prove properties of the
executions $\omega(A, \iota)$ by simply considering the random inputs $\iota$. Again, this means
overlooking the power of Player(2) in influencing the distribution of $\omega(A, \iota)$.


To summarize, we have argued that a general model for randomized computing 
should allow the simultaneous consideration of several algorithms: such situations 
are typically encountered while proving lower bounds. It is natural to view the sit- 
uation as a game between Player(1), the algorithm designer, and Player(2) the ad- 
versary designer. The algorithms are the strategies of Player(1) and the adversaries 
are the strategies of Player(2). This model should provide an explicit mechanism of 
communication between the two players allowing them to update their knowledge 
as the execution progresses. 


The goal of this thesis is to present such a model and to illustrate its generality on 
several examples. We now present the work done in the thesis. 


We present in Chapter 2 a general model corresponding to the previous discussion. 
The model is simply constructed to formalize that both players take steps in turn 
having only a partial knowledge of the state of the system. This knowledge is updated
at every move of either player. We then construct carefully the “natural”
probability space $(\Omega_{\pi,A}, \mathcal{G}_{\pi,A}, P_{\pi,A})$ obtained on the set of executions when the
algorithm is a given $\pi$ and when the adversary is a given A. Our construction requires
the technical (but fundamental) hypothesis that all the coins used in the game have 
at most countably many outcomes. 


To our knowledge, this last probabilistic construction was similarly never conducted
to completion. In [58] and [29] Vardi and Hart et al. present the $\sigma$-field $\mathcal{G}_{\pi,A}$ that
allows one to study probabilistically events “depending on finitely many conditions”.
Nevertheless some additional work has to be devoted to justify the existence of a
probability measure $P_{\pi,A}$ on the set of executions. We prove this existence by a
limiting argument using Kolmogorov’s theorem.⁵


⁵ Aspnes and Herlihy in [3] define formally the measure $P_A$ to be the image measure of the
measure on the random inputs under the mapping $\iota \mapsto \omega(A, \iota)$, where $\iota$ denotes a generic sequence
of random inputs and $\omega(A, \iota)$ is the unique execution corresponding to a given adversary A and a
given sequence of random inputs $\iota$. Indeed, the $\sigma$-fields are defined precisely so that the mapping
$\iota \mapsto \omega(A, \iota)$ is measurable. But this approach cannot be used in general situations as it presupposes
a well defined probability measure on the set of random inputs $\iota$. This is trivially the case when
the sequence of inputs is obtained with independent coins, in which case the space of random draws
$\iota$ is endowed with the product structure. This property holds for many published randomized
algorithms such as those in [3, 37]. Nevertheless this is not the case in [4] which considers a
situation where the weight of the coins used is modulated along the execution. Also, this is not
An important feature of our model is the symmetry existing between the notions
of algorithm and of adversary. Even though this symmetry is rather natural in the
light of the game theory interpretation it nevertheless seems to be rather novel with 
respect to the existing work. Most of the models presented so far are concerned with 
the analysis of a single algorithm, i.e., when Player(1) has a single strategy. This 
very specific (even though important) situation obscures the natural game theory
structure that we reveal and led various authors to asymmetric models where the 
adversaries depend specifically on the algorithm considered. As discussed later, our 
Chapter 7 provides a striking illustration in favor of a symmetric model. 


Our Chapter 3 analyzes Rabin’s randomized algorithm for mutual exclusion [49]. 
This algorithm is one of the few in the literature whose correctness hinges critically 
on the limited knowledge available to Player(2). 


As mentioned above, the study of such algorithms is rather complex and requires 
the exact formalization of the notion of knowledge. In the absence of a formal model 
for randomized computing Rabin resorted to intuition in his original paper [49] to 
express and establish the intended correctness statement. One year later, in 1983, 
as an illustration of their general method, Hart, Sharir and Pnueli provided in [29] 
some additional “justifications” to the correctness of the algorithm. 


As part of our analysis we first show how the formal model of Chapter 2 allows 
one to formalize accurately the hypotheses. Our analysis then reveals that the 
correctness of the algorithm is tainted for two radically different reasons. We first 
show that the informal statement proposed by Rabin admits no natural “adequate” 
formalization. The problem is in essence the following. The statement involves 
proving that the probability of a certain event C is “high” against all adversaries 
if a certain precondition B holds. One natural way to formalize such a statement 
might seem to consider the expression 


$\inf_{A} P_A[C \mid B]$


Nevertheless we show that this would be tantamount to giving the knowledge of B 
to Player(2). But Player(2) would not be able to derive the knowledge of B from 
the mere information sent to him in the course of the execution: Player(2) “learns” 


the case in the complex paper [25] where a random adversary adapts its moves based on the past
execution. Finally, this is also not the case in the general models in [29, 58], where the $i$-th coin
used depends on the state $s_i$ of the system at the time of the $i$-th flip.

Note nevertheless that the argument of Aspnes and Herlihy in [3] is now valid in the light of our 
construction in Chapter 2 which precisely justifies the existence of a probability distribution on the 
sequence of flips even when these flips are not independent. 
some non-trivial fact from the conditioning on B. Not surprisingly Player(2) can
then defeat the algorithm if this measure is used. 


This first problem is not related to the proof of the correctness statement but 
“merely” to its formalization. Our analysis shows that such formalization prob- 
lems are general and proposes a partial method to formalize adequately informal 
high-probability correctness statements. 


We then show that the algorithm suffers from a “real” flaw disproving a weaker but 
adequate (in the sense just sketched) correctness measure. The flaw revealed by 
our analysis is precisely based on the fact that the dynamics of the game between 
the two players allow Player(2) to acquire more knowledge than a naive analysis
suggests. 


To finish we establish formally a correctness result satisfied by Rabin’s algorithm. 
The method we apply is rather general and proceeds by successive inequalities until 
derivation of a probabilistic expression depending solely on the random inputs. Such 
an expression is independent of Player(2) and its estimation provides a lower bound. 


Our Chapter 4 analyzes Lehmann-Rabin’s Dining Philosophers algorithm [37]. In
spite of its apparent simplicity this algorithm is not trivial to analyze as it requires
one, as usual, to unravel the complex dependencies between the random choices made by
the algorithm and the non-deterministic choices made by Player(2). The original 
proof given in [37] does not track explicitly how the probabilities depend on the 
adversary A and is therefore incomplete. In a subsequent paper [47], Pnueli and Zuck
provide a proof of eventual correctness of the algorithm. The method used there
shares the characteristics of most general semantic methods described on page 14: it 
establishes that some event eventually happens with probability one. Furthermore it 
does not take into account the specific probability distribution used for the random 
inputs. 

We present instead a new method in which one proves auxiliary statements of the
form $U \xrightarrow[p]{t} U'$, which means that whenever the algorithm begins in a
state in set $U$, it will, with probability at least $p$, reach a state in set $U'$
within time $t$. A key theorem about our method is the composability of these arrows,
which allows each such result to be used as a building block towards a global proof.
(This part was developed jointly with Roberto Segala.) Our method presents the
following two advantages.


It first provides a tighter measure of progress than “eventual progress”: we provide
a finite time $t$ and a probability $p$ such that, for all adversaries, progress happens
within time $t$ with probability at least $p$. This trivially implies eventual progress
with probability one.
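
The composability of these arrows can be sketched concretely. The following Python fragment is only an illustration under stated assumptions: it assumes the composition rule in which times add and probability bounds multiply (this introduction does not spell the rule out), and the names `Arrow`, `then` and `failure_bound`, as well as the numeric values, are hypothetical. It also records why a finite-time bound implies eventual progress: if the arrow applies afresh at the start of each window of length t, the probability of missing the target in k consecutive windows is at most (1 - p)^k, which tends to 0.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Arrow:
    """'From any state in `source`, reach `target` within `time` with probability >= `prob`.'"""
    source: str
    target: str
    time: Fraction
    prob: Fraction

    def then(self, other: "Arrow") -> "Arrow":
        # Assumed composition rule: times add, probability bounds multiply.
        if self.target != other.source:
            raise ValueError("arrows do not chain")
        return Arrow(self.source, other.target,
                     self.time + other.time, self.prob * other.prob)


def failure_bound(arrow: Arrow, windows: int) -> Fraction:
    """Upper bound on the probability of missing the target in `windows`
    consecutive time windows, assuming the arrow applies afresh in each one."""
    return (1 - arrow.prob) ** windows


# Hypothetical values, for illustration only.
a = Arrow("U", "V", Fraction(2), Fraction(1, 2))
b = Arrow("V", "W", Fraction(3), Fraction(1, 4))
ab = a.then(b)
```

Exact rationals are used so that the composed bounds can be compared without rounding concerns.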
As mentioned on page 14, eventual probability-one liveness properties are usually
considered not because of their intrinsic relevance but because of the limitations of
the general methods used. Our method uses more specific and refined tools than
very basic facts of probability theory such as “if I throw a coin an infinite
number of times then it will fall an infinite number of times on heads”, mentioned
by Lehmann and Shelah in [38].⁶ Nevertheless it is still general and simple enough
to apply to a variety of situations and tighten the liveness measures usually derived.


An additional advantage of our method is that it does not require working with
events belonging to the tail $\sigma$-field, an enterprise which can present some subtle
complications. Recall that the argument proposed by Lehmann and Rabin
consisted in conditioning on the fact that no progress occurred and then deriving a
contradiction. Our discussion at the beginning of Chapter 3 shows that conditioning
on any event (let alone on an event from the tail $\sigma$-field) can be
problematic.


Chapters 5 and 7 are concerned with a scheduling problem in the presence of faults.
Chapter 5 investigates the deterministic case, where algorithms do not use
randomization, and Chapter 7 investigates the randomized case. In both situations
we are interested in providing optimal algorithms. Both are rather complex, even
though in very different ways. The solution of Chapter 5 is obtained by translating
the problem into a graph problem. A key tool is Ore’s Deficiency Theorem, which gives a
dual expression of the size of a maximum matching in a bipartite graph.


In Chapter 7 we describe a randomized scheduling algorithm and establish formally
its optimality. To our knowledge, only Graham and Yao had previously given
a proof of the exact optimality of a randomized algorithm.


The method that we develop to prove optimality is rather general and in particular 
encompasses in its scope the specific proof strategy used by Graham and Yao in 
their paper [25]. This method is presented by itself in Chapter 6. It is in essence 
based on an application of a min-max equality reversing the roles of Player(1) and 
Player(2) in the scheduling game considered. This min-max method uses critically 
the natural symmetric structure between the two players. The success of this point 
of view also brings a striking illustration of the relevance of a general symmetric 
model for randomized computing as claimed at the beginning of this chapter. In 
particular, critical to the proof is the fact that adversaries are randomized and 
defined independently of any specific algorithm, much in the same way as algorithms 
are randomized and defined independently of any specific adversary. 


⁶See page 14 of this chapter for a more complete quote.
Even though the proof of [25] and our proof are both applications of the general
proof method presented in Chapter 6, the two proofs differ in an essential way. We
comment on this difference to give an account of the complexities involved in such
proofs. Critical to the proof of Graham and Yao [25] is the fact that, in their model,
Player(2) knows explicitly the strategy (i.e., the algorithm) used by Player(1): their
proof would not hold without this assumption. On the other hand, in the formal
model that we use, Player(2) is not explicitly provided with the knowledge of the
strategy used by Player(1), and our proof would not hold if it were.


Nevertheless, as is argued in Chapter 6, in both [25] and in our work, the optimality
of an algorithm does not depend on whether Player(2) knows the
algorithm in use. This knowledge merely affects the shape taken by the proof. This
subtle point should make clear, we hope, that formal proofs of optimality are bound
to be very complex.


We briefly describe the content of Chapter 6. This chapter presents a general
proof method for attempting to prove that a given algorithm is optimal. At the core of
the method is the use of the min-max equality


$$\max_{\pi} \inf_{A} f(\pi, A) = \min_{A} \sup_{\pi} f(\pi, A),$$


where $f(\pi, A)$ is the performance of the algorithm $\pi$ against the adversary $A$.
Situations where such an equality holds are therefore fundamental to us: in each of
these cases our method makes it possible to prove the optimality of an algorithm.
We show that this equality holds in the following two cases. (In the first case the
equality is a simple reinterpretation of Von Neumann’s theorem.)


1. When the strategies of either player are the convex combinations of a finite
set called the set of pure strategies (and when $f(\pi, A)$ is the expected value
$E_{\pi,A}[T]$ of some random variable $T$).


2. When one player, typically Player(2), “knows” the strategy used by the other 
player. 


The two settings yield different proof systems. Very interestingly, our proof of
optimality in Chapter 7 falls in case 1 whereas the proof of optimality of Graham
and Yao [25] falls in case 2.
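
Case 1 can be illustrated on the smallest possible example, a 2x2 matrix game. The sketch below is only an illustration: the payoff matrix `M` is hypothetical, and the closed-form optimal mixtures used are the standard ones for a 2x2 zero-sum game without a saddle point. With pure strategies only, max-min and min-max differ; once convex combinations are allowed, both sides of the equality meet at the value of the game.

```python
from fractions import Fraction

# Hypothetical payoff matrix: M[i][j] is the performance obtained when
# Player(1) plays pure strategy i and Player(2) plays pure strategy j.
M = [[Fraction(3), Fraction(0)],
     [Fraction(1), Fraction(2)]]


def worst_case_for_row_mix(M, p):
    """Worst expected payoff when Player(1) plays row 0 with probability p."""
    # A linear function over mixtures attains its infimum at a pure column.
    return min(p * M[0][j] + (1 - p) * M[1][j] for j in range(2))


def best_case_for_col_mix(M, q):
    """Best expected payoff when Player(2) plays column 0 with probability q."""
    return max(q * M[i][0] + (1 - q) * M[i][1] for i in range(2))


# Pure strategies only: max-min and min-max differ.
pure_maxmin = max(min(row) for row in M)
pure_minmax = min(max(M[i][j] for i in range(2)) for j in range(2))

# Closed-form optimal mixtures for a 2x2 game without a saddle point.
(a, b), (c, d) = M
denom = a - b - c + d
p_star = (d - c) / denom           # Player(1)'s probability of row 0
q_star = (d - b) / denom           # Player(2)'s probability of column 0
value = (a * d - b * c) / denom    # common value of max-min and min-max
```

Here Player(1) is the maximizer, matching the convention that $f$ measures the performance of the algorithm.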


To summarize, we present in this thesis a formal and general model for randomized
computing. Under the assumption that all the coins have at most countably many
outcomes we succeed in constructing formally the probability spaces on the set of
executions to be used in the analysis of an algorithm. To our knowledge, our model
is stronger than all models previously constructed in that 1) it allows one
to formalize accurately the precise nature of the knowledge held by both players (this
is the critical point in Rabin’s randomized algorithm for mutual exclusion); 2) it allows
one to conduct formal lower-bound proofs for randomized computing (no formal model
existed so far allowing that⁷); and 3) it does not require, as do the models presented
by Vardi in [58] and Hart et al. in [29], that the algorithms be finite-state programs.
(Actually our model allows for algorithms having a state space with the cardinality
of the continuum.) Chapters 3 and 7 illustrate the first two points. The last
one is a simple consequence of the fact that our model allows the algorithm to have
an infinite memory.


In Chapter 3 we present the notion of adequate performance measures for an algo- 
rithm: measures that are “naturally” attached to an algorithm and provide mean- 
ingful estimations. We illustrate this notion in our analysis of Rabin’s algorithm 
for mutual exclusion [49]. We furthermore provide a rather general technique for 
proving rigorously high-probability statements holding against all adversaries. 


In Chapter 4 we provide a rather general technique for proving upper bounds on 
time for randomized algorithms. We illustrate this technique in our analysis of
the Lehmann-Rabin Dining Philosophers algorithm [37].


In Chapter 6 we present a general proof methodology for attempting to prove that a
given algorithm is optimal. We illustrate this method in Chapter 7 with a specific
example.


⁷The model developed by Graham and Yao is an ad-hoc model for the specific situation
pertaining to [25].


Chapter 2 


A General Model for 
Randomized Computing 


We argued in Chapter 1 that formal proofs of correctness of randomized algorithms
require a formal model for randomized computing. Such a model should formalize
the notion of algorithm and of adversary, and formalize how these two entities 
interact. 


We also argued that the game-theoretic point of view is most appropriate to
understand and model these notions. For this, we let Player(1) be the entity selecting
the algorithm and Player(2) be the entity selecting the adversary. In this language,
algorithms and adversaries are strategies of Player(1) and Player(2), respectively.
We therefore sometimes call Player(1) the algorithm-designer and Player(2) the
adversary-designer. If Player(1) selects the algorithm $\pi$ and if Player(2) selects
the adversary $A$, the game played by the two players consists of the alternating
actions of the algorithm and the adversary: Player(1) takes all the actions as described
by $\pi$ until the first point where some choice has to be resolved by the adversary;
Player(2) then takes actions to resolve this choice as described by $A$, and Player(1)
resumes action once the choice has been resolved, and so on. This means that the two
players play sequentially.


The purpose of this chapter is to construct both 1) such a general model for ran- 
domized computing and 2) the associated probability spaces used for the analysis of 
randomized algorithms. Our model is presented in Section 2.3. The construction of 
the associated probability spaces is presented in Section 2.4. Section 2.1 investigates 
the features that a general model should be endowed with. Section 2.2 motivates 
the formal presentation given in Section 2.3. 
2.1 Which Features should a General Model have?


We argue here that a model for randomized computing should have the following
general features. 1) It should allow one to analyze the performance not only of a given
algorithm but of a whole class of algorithms. 2) It should formalize
the notions of both an adversary and an algorithm: for emphasis, the algorithms
and adversaries thus formalized are called admissible. 3) It should allow the
adversaries to be randomized. 4) It should allow one to formalize that both Player(1) and
Player(2) have in general only a partial knowledge of the state during the execution.
In particular it should provide an explicit mechanism of communication between the
two players allowing them to update their knowledge as the execution progresses.
5) An admissible adversary should be characterized independently of the choice
of any specific admissible algorithm.


1. & 2. Concurrent analysis of several algorithms; Formalization of the
notion of algorithm. As mentioned in Chapter 1, most of the existing models for
randomized computing (e.g., [3, 28, 29, 47, 54, 58]) are implicitly designed for the
analysis of one algorithm. In contrast we have in mind a general model in which all
proofs and arguments about randomized algorithms could be expressed. Our model
must in particular be suited for the formalization of lower-bound proofs. In that case
the analysis considers not only one, but a whole family $\Pi$ of algorithms. This family
must be characterized much in the same way as the family A of adversaries must be
characterized. Note that characterizing the family $\Pi$ corresponds to modeling the
notion of algorithm. In all the papers cited above, which analyze a fixed algorithm,
the necessity to model an algorithm was not felt: more exactly, the description of the
algorithm analyzed was the only modeling required. This situation corresponds
to the case where $\Pi$ is reduced to a singleton $\{\pi\}$.


3. Randomized adversaries. We now turn to the third point and argue that a 
general model should allow the adversary to be randomized. 


It is often claimed that, in the absence of cryptographic hypotheses, the analysis can 
“without loss of generality” consider only non-randomized adversaries. For instance 
in [29] the authors write:


“Note that the schedule’s “decisions” are deterministic; that is, at each 
tree node a unique process is scheduled. One might also consider more 
general schedules allowing the schedule to “draw” the process to be sched- 
uled at each node using some probability distribution (which may depend 
on the particular tree node). However ... there is no loss of generality 
in considering deterministic schedules only.” 
The reason is that, whenever making a random choice among a set $D$, an optimal
Player(2) can instead analyze the outcomes corresponding to all the choices $d$ in
$D$ (more exactly, to all the measurable choices) and choose one that best lowers
the performance of the algorithm. (More exactly, it can select one that brings the
performance of the algorithm arbitrarily close to the infimum “$\inf_{d \in D}$ performance
under choice of $d$”.)
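
The averaging argument behind this claim can be checked numerically. The sketch below is an illustration under simplifying assumptions: the choice set is finite (so the infimum is attained), and the `performance` table and the two mixtures are hypothetical. Any randomized choice yields an expected performance that is a weighted average of the pure performances, and hence is never below their minimum.

```python
from fractions import Fraction

# Hypothetical performance of the algorithm under each choice d in D.
performance = {"d1": Fraction(3, 4), "d2": Fraction(1, 2), "d3": Fraction(9, 10)}


def mixed_performance(weights):
    """Expected performance when Player(2) draws d according to `weights`."""
    assert sum(weights.values()) == 1, "weights must form a probability distribution"
    return sum(w * performance[d] for d, w in weights.items())


# What a deterministic Player(2) achieves by picking the most damaging choice.
best_pure = min(performance.values())

# Two hypothetical randomized choices.
uniform = {d: Fraction(1, 3) for d in performance}
skewed = {"d1": Fraction(1, 10), "d2": Fraction(8, 10), "d3": Fraction(1, 10)}
```

Putting all the weight on the minimizing choice recovers `best_pure` exactly, which is the sense in which randomization does not make Player(2) more powerful.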


The above argument shows that a model allowing the adversaries to be randomized 
does not make Player(2) more powerful. Nevertheless we do not share the more 
restrictive view expressed in [13]: 


“The motivation for models where the adversary is assumed to exhibit 
certain probabilistic behavior is that the worst case assumptions are fre- 
quently too pessimistic, and phenomena like failures or delays are often 
randomly distributed.” 


Randomization can undoubtedly be useful in a setting where one wants to weaken
the power of Player(2). But, even when considering an “unweakened” Player(2),
allowing Player(2) to use randomization can be of inestimable help in the proof
of optimality of a given (optimal) randomized algorithm $\pi_0$. Indeed, the proof
methodology for establishing the optimality of a randomized algorithm presented
in Chapter 6 of this thesis requires providing a specific adversary and proving
that this adversary satisfies an optimality property.¹ (Recall that an adversary is a
strategy of Player(2).) As just mentioned, randomization does not make Player(2)
more powerful, and the existence of such an optimal randomized adversary therefore
implies the existence of a deterministic optimal adversary. But, in general, merely
describing such a deterministic optimal adversary can prove to be a very hard task
(e.g., taking non-constant space, or even exponential space), therefore barring
the possibility of establishing the optimality of an algorithm, at least using our proof
methodology. On the other hand, if the adversary is allowed to use randomization,
we can in some instances provide the description of a “simple” optimal randomized
adversary, thus also proving the optimality of the algorithm $\pi_0$.


An illustration of this phenomenon is given in the proof of optimality given by
Graham and Yao in [25]. A close analysis of the proof of [25] shows that it follows
the general methodology given in our Chapter 6: it introduces a specific adversary
$A_0$ and proves that it satisfies an optimality property. A fundamental assumption
of the model used in [25] is that Player(2) “knows” explicitly the algorithm
in use. This knowledge is used critically to define the strategy $A_0$: at every
point of the execution, Player(2) determines its next step by emulating $\pi_0$ under
certain conditions. As $\pi_0$ is randomized, the emulation of $\pi_0$ also requires the use of
randomness, and $A_0$ is therefore a randomized adversary.


¹The desired notion of optimality for the adversary will be clarified in Chapter 6.


This discussion justifies that a general model should allow the adversary to use 
randomness. We will provide in Chapter 7 of this work another application of this 
fact. 


4. Formalization of the notion of knowledge. We now turn to the fourth 
point and argue for a model formalizing the notion of knowledge. The model should 
also provide an explicit mechanism of communication between the algorithm and 
the adversary, allowing them to update their knowledge as the execution progresses. 


It should be intuitively clear that an optimal strategy of Player(2) is one where 
Player(2) optimizes at each of its steps the knowledge it holds so as to take decisions 
most detrimental to the performance sought by Player(1). Similarly, in cases where 
Player(1) has more than one strategy $\pi$ to choose from (i.e., in cases where more
than one algorithm can be considered), the best strategy is one that takes best
advantage of the knowledge available about the past moves of Player(2) and about 
the strategy implemented by Player(2). 


Establishing the performance of an algorithm is always tantamount to proving that 
Player(2) adopting an optimal strategy cannot reduce the performance below the 
level of performance claimed for that algorithm. Equivalently, a proof of correctness 
must in some way establish a bound on the usefulness of the knowledge available to 
Player(2). Similarly, to establish the optimality of an algorithm one is led to show 
that no other admissible algorithm can use more efficiently the knowledge available 
to Player(1). 


This justifies that a general probabilistic model for randomized computing should 
formalize the notion of knowledge available to the players, and that it should
provide an explicit mechanism of communication between the players, formalizing
the update of their knowledge as the execution progresses. 


5. An adversary can be associated with any algorithm. We now turn to the
fifth point and argue that, in a general model for randomized computing, an
admissible adversary should be characterized independently of any specific admissible
algorithm. Note first that, by definition, an algorithm $\pi$ is defined independently of
a given adversary $A$. On the other hand, an adversary might seem to be defined only
in terms of a given algorithm: the adversary is by definition the entity that resolves
all choices not in the control of the algorithm considered. We show with an example
that this conception is incorrect and that a correct game-theoretic interpretation yields
adversaries defined independently of the algorithm considered.


A situation where one can encounter several algorithms in the course of a correctness
proof is one where, as in Chapters 3 and 4 of this thesis, a program $C$ is studied
for various initial conditions $s_i$, $i \in I$. One can then model an algorithm as a
couple $(C, s_i)$: we have a different algorithm $\pi_i$ for each different initial condition
$s_i$.² The code $C$ is considered to “be correct” (with respect to the specifications of
the problem considered) if all algorithms behave well against Player(2). Note that,
in this situation, one implicitly assumes that Player(2) “knows” which algorithm
$\pi_i$ is in use. A strategy $A$ for Player(2) (i.e., an adversary) is then accurately
defined to be a family $(A_i)_{i \in I}$, one for each algorithm $\pi_i$. We will say that $A_i$ is
an adversary specially designed for $\pi_i$.³ The adversary $A = (A_i)_{i \in I}$ is clearly not
associated with a specific algorithm $\pi_i$.


More generally one will define an adversary to be a strategy of Player(2) taking into
account all the information available during the execution. (In the previous example
one assumed that Player(2) was told the algorithm selected by Player(1).) We thus
obtain a symmetric model where algorithms $\pi$ in $\Pi$ (resp. adversaries $A$ in the
family A) are defined independently of any choice of $A$ (resp. of any choice of $\pi$).
Because they deal only with the special case where Player(1) has only one strategy, many
of the published models have developed in ways obscuring this natural symmetry
between Player(1) and Player(2), i.e., between adversaries and algorithms. (The
model presented in [40] suffers from this flaw.) This represents more than an aesthetic
loss. For, as we show in Chapter 6 of this thesis, the symmetry of the two players is
expressed by a max-min equality and plays a central role in the proof of optimality
of a randomized algorithm.


Note that the properties 1, 2, 3 and 5 claimed for a general model — the possibility 
to analyze and compare several algorithms, the necessity to formalize the notion 
of an algorithm, the allocation of randomness to the adversary and the possibility 
to characterize an admissible adversary independently of any specific admissible 
algorithm — are useful mostly for lower-bounds. This explains why none of the 
models considered so far in the literature included these features. 


On the other hand, the fourth claim (that the notion of knowledge is crucial for
proofs of correctness and should be explicitly incorporated in a model) is very
relevant to upper bounds. Proofs that do not approach formally the notion of knowledge
are walking a very slippery path, to use a term first coined in [37] and then often
quoted. An illustration of this fact is, as we will see, the algorithm for mutual
exclusion presented in [49], whose correctness property hinges critically on the limited
knowledge available to Player(2).


²We will review this case in Section 2.2.
³We will come back in more detail to this notion in Chapter 6.


2.2 Towards the Model


The previous section outlined the features that a general model for randomized
computing should be endowed with. This section first analyzes various
equivalent ways to construct a model and then motivates the formal
definition presented in Section 2.3.


2.2.1 Equivalent ways to construct a model 


We illustrate here how equivalent models for randomized computing can be obtained 
by exchanging properties between the model used for an algorithm and the model 
used for an adversary. 


A first difficulty encountered when modeling the notions of adversary and algorithm
is that these two notions cannot be defined independently.⁴ As a consequence, some
properties can be exchanged between the model used for an algorithm and the model
used for an adversary. This can lead to different but equivalent models.


We illustrate this fact with three examples. The first example expands on the 
example presented in Section 2.1 and shows that equivalent models can be achieved 
by allocating to Player(1) or to Player(2) the choice of the initial configuration. The 
second example shows that, when considering timed algorithms, equivalent models 
can be achieved by allocating to Player(1) or to Player(2) the control of the time. 
The third example shows that, in the case where Player(2) is assumed to know the 
past of an execution, equivalent models can be achieved by allocating to Player(1) 
or to Player(2) the control of the random inputs to be used next by Player(1). 


Consider first the case of Lehmann-Rabin’s algorithm presented in [37] and studied 
in Chapter 4 of this work. 


The correctness property of the algorithm expresses that “progress occurs with
“high” probability whenever a process is in its Trying section”. Let $C$ be the program
described in [37] and recalled on page 101. A possible way to model the situation is
to consider that an algorithm $\pi$ is defined by the conjunction $(C, s_{init})$ of $C$ and of
the initial configuration $s_{init}$. We then derive a family of algorithms, one for each
initial configuration $s_{init}$. In this case, an adversary is a strategy guiding the moves
of Player(2) against all such possible algorithms $(C, s_{init})$: by assumption Player(2)
learns which algorithm $(C, s_{init})$ is selected by Player(1) at the beginning of the
execution. Another possible way to model the situation is to assume the existence
of a single algorithm, namely $C$, and to let Player(2) select the initial configuration.


⁴This is not in contradiction with point 5 above stating that “an admissible adversary should
be characterized independently of any specific admissible algorithm”. We speak here of the way
we define the sets $\Pi$ and A of algorithms and adversaries. We spoke in point 5 of the way a given
element $A$ in A is characterized.


This modeling duality is similarly encountered in the consensus problem studied 
in [3], in the mutual exclusion problem studied in [49] and more generally in all 
situations where the initial configuration or input is not a priori determined. In [61] 
Yao specifically considers this situation. 


As a second example we now consider the timed version of the same Lehmann-Rabin
algorithm [37]. In this situation we restrict our attention to executions
having the property that “any participating process does not wait more than time
1 between successive steps”. The control of the time can be allocated in (at least)
two different ways resulting in two different models. One way is to allocate the
time control to Player(1): each timing policy such that no participating process
waits more than time 1 for a step corresponds to a different admissible algorithm.
Another solution is to assume instead that Player(2) controls the passing of time:
an adversary is admissible if it does not let time pass without allocating a step to a
process having waited time 1. This last solution is the one adopted in Chapter 4 of
this work.
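
The timing restriction can be checked mechanically on a finite schedule. The sketch below is only an illustration and is not the formal definition used in Chapter 4: it assumes that time starts at 0, that every listed process participates from time 0, and that the schedule is observed up to a finite `horizon`. The function name `is_admissible` and the sample schedules are hypothetical.

```python
from collections import defaultdict


def is_admissible(schedule, processes, horizon, max_wait=1.0):
    """Return True if no process waits more than `max_wait` between steps.

    `schedule` is a list of (time, process) pairs giving the instants at which
    the adversary allocates a step. Assumptions of this sketch: the clock
    starts at 0, all `processes` participate from time 0, and waiting is also
    measured from the last step up to `horizon`.
    """
    times = defaultdict(list)
    for t, proc in schedule:
        times[proc].append(t)
    for proc in processes:
        points = [0.0] + sorted(times[proc]) + [horizon]
        # Compare each instant with the next one for this process.
        if any(later - earlier > max_wait
               for earlier, later in zip(points, points[1:])):
            return False
    return True


# Hypothetical schedules for two processes "p" and "q".
good = [(0.5, "p"), (1.0, "q"), (1.5, "p"), (2.0, "q")]
bad = [(0.5, "p"), (1.9, "p"), (1.0, "q"), (2.0, "q")]
```

Whether this predicate is read as a constraint on Player(1) or on Player(2) is exactly the modeling choice discussed above; the check itself is the same in both models.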


We provide another less trivial example of the possible trade-off between the notion 
of adversary and the notion of algorithm. This example shows that the two models 
for randomized concurrent systems of Lynch et al. [40] and Vardi [58], page 334, 
are equivalent. We summarize here the model presented in [40]. (See Chapter 4 
and [54] for more details.) In this model a randomized algorithm is modeled as a 
probabilistic automaton: 


Definition 2.2.1 A probabilistic automaton $M$ consists of four components:


• a set $states(M)$ of states

• a nonempty set $start(M) \subseteq states(M)$ of start states

• an action signature $sig(M) = (ext(M), int(M))$ where $ext(M)$ and $int(M)$ are
disjoint sets of external and internal actions, respectively

• a transition relation $steps(M) \subseteq states(M) \times acts(M) \times Probs(states(M))$,
where the set $Probs(states(M))$ is the set of probability spaces $(\Omega, \mathcal{G}, P)$ such
that $\Omega \subseteq states(M)$ and $\mathcal{G} = 2^{\Omega}$.


An execution fragment $\alpha$ of a probabilistic automaton $M$ is a (finite or infinite)
sequence of alternating states and actions starting with a state and, if the execution
fragment is finite, ending in a state, $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$, where for each $i$
there exists a probability space $(\Omega, \mathcal{G}, P)$ such that
$(s_i, a_{i+1}, (\Omega, \mathcal{G}, P)) \in steps(M)$ and $s_{i+1} \in \Omega$.


An adversary for a probabilistic automaton $M$ is then defined to be a function $A$
taking a finite execution fragment of $M$ and giving back either nothing (represented
as $\bot$) or one of the enabled steps of $M$ if there are any.
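
These definitions can be made concrete on a toy example. The sketch below is only an illustration: the two-state automaton `STEPS`, the particular `adversary`, and the helper `execution_law` are hypothetical and are not taken from [40]. A step is a triple (state, action, distribution), the adversary is a function of the finite execution fragment that returns an enabled step or `None` (standing for $\bot$), and the induced distribution over maximal execution fragments is computed exactly by enumeration.

```python
from fractions import Fraction

# A step is (state, action, distribution over next states).
# Hypothetical automaton: from "s", the action "flip" tosses a fair coin.
STEPS = [
    ("s", "flip", {"s": Fraction(1, 2), "t": Fraction(1, 2)}),
    ("s", "stay", {"s": Fraction(1)}),
]


def enabled(state):
    """Steps whose first component is the current state."""
    return [step for step in STEPS if step[0] == state]


def adversary(fragment):
    """Function of the finite fragment (s0, a1, s1, ...): an enabled step or None.

    This particular adversary schedules "flip" at most twice, then stops.
    """
    state = fragment[-1]
    flips_so_far = fragment[1::2].count("flip")
    options = [step for step in enabled(state) if step[1] == "flip"]
    return options[0] if options and flips_so_far < 2 else None


def execution_law(fragment, prob=Fraction(1)):
    """Exact distribution over maximal execution fragments under `adversary`."""
    step = adversary(fragment)
    if step is None:
        return {fragment: prob}
    _, action, dist = step
    law = {}
    for nxt, p in dist.items():
        extended = fragment + (action, nxt)
        for frag, q in execution_law(extended, prob * p).items():
            law[frag] = law.get(frag, 0) + q
    return law


law = execution_law(("s",))
```

Because the adversary reads the whole fragment, it "remembers the past": here it counts the flips already scheduled before deciding whether to continue.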


A quick interpretation of this model is as follows. A step $(s, a, (\Omega, \mathcal{G}, P))$ in
$steps(M)$ represents a step of the adversary. During an execution, a step
$(s, a, (\Omega, \mathcal{G}, P))$ can be selected by the adversary as its $i$-th selection only if
the state $s_{i-1}$ of the underlying system is $s$. (Note that this condition implicitly
assumes the existence of a mechanism allowing Player(2) to “know” precisely what the
state of the system is. Furthermore the definition of the adversary as a function of the
whole past execution fragment means that Player(2) “remembers the past” and chooses
the successive steps based on this knowledge. This precise model therefore does not
apply to cases where, as in [49], Player(2) does not have access to the full knowledge
of the state of the system.) The second field $a$ and the third field $(\Omega, \mathcal{G}, P)$ of the
step of the adversary characterize the step to be taken next by the algorithm: this
step corresponds to the action $a$ and consists in choosing randomly an element $s_i$ in
$\Omega$ according to the probability distribution $P$.


Consider the case where, for every $(s, a)$ in $S \times A$, there is a fixed probability space
$(\Omega_{s,a}, \mathcal{G}_{s,a}, P_{s,a})$ such that, for every step in $steps(M)$, if the state is $s$ and the
action is $a$, then the associated probability space is $(\Omega_{s,a}, \mathcal{G}_{s,a}, P_{s,a})$.⁵ In this case,
we can change the model of [40] into an equivalent model by making the following
changes. We model a randomized algorithm to be a family
$(\Omega_{s,a}, \mathcal{G}_{s,a}, P_{s,a})_{(s,a) \in S \times A}$ of probability spaces. Following the idea
presented in [58], page 334, we redefine
the notion of adversary by saying that an adversary is a function $A$ taking a finite
execution fragment $s_0 a_1 s_1 \ldots$ of $M$ and giving back either nothing (represented
as $\bot$) or one of the enabled actions of $M$ if there are any. (This action is said to be
“decided” by the adversary.)⁶ Furthermore, if Player(2) decides action $a$ while the
state is equal to $s$ then the algorithm takes its next step by choosing randomly an
element of $\Omega_{s,a}$ according to the probability distribution $P_{s,a}$. Hence the family
$(\Omega_{s,a}, \mathcal{G}_{s,a}, P_{s,a})_{(s,a) \in S \times A}$ of probability spaces can be interpreted as being
the local dynamics of the algorithm. In this model an algorithm is therefore identified with
its local dynamics. We will expand on this theme later in Section 2.3.


⁵The model presented in [40] does not always assume this fact: this model is a hybrid where the
adversary does decide at each step what is the probability space to be used next by the algorithm,
but where this space might not be uniquely characterized by the action $a$ selected. This means in
essence that the set of actions, $sig(M)$, is not big enough to describe accurately the set of decisions
taken by Player(2). In order to do so we need to refine the set of actions. Doing this allows one
to get a one-to-one correspondence $(s, a) \in S \times A \mapsto (\Omega_{s,a}, \mathcal{G}_{s,a}, P_{s,a})$ and hence
reduces the model of [40] to our situation.
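
The passage from steps to local dynamics can be sketched as a simple re-indexing. The fragment below is only an illustration: the function names `to_local_dynamics` and `next_state_law` and the sample steps are hypothetical. It builds the family indexed by (state, action) and refuses to do so when a pair does not determine a unique probability space, which is exactly the hypothesis made above.

```python
from fractions import Fraction


def to_local_dynamics(steps):
    """Index steps (s, a, dist) by (s, a), assuming one distribution per pair."""
    dynamics = {}
    for state, action, dist in steps:
        key = (state, action)
        if key in dynamics and dynamics[key] != dist:
            raise ValueError(f"{key} does not determine a unique probability space")
        dynamics[key] = dist
    return dynamics


def next_state_law(dynamics, state, action):
    """Law of the algorithm's next state once the adversary has decided `action`."""
    return dynamics[(state, action)]


# Hypothetical steps, for illustration only.
steps = [
    ("s", "flip", {"s": Fraction(1, 2), "t": Fraction(1, 2)}),
    ("s", "stay", {"s": Fraction(1)}),
]
dynamics = to_local_dynamics(steps)
```

In this representation the adversary only needs to return an action; the distribution used next is looked up in the algorithm's own table.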


It is easy to convince oneself that this model is equivalent to the one of [40]. This 
provides another example of a possible trade-off in the model of the algorithm and 
in the model of the adversary: in essence, the difference lies in whether Player(1) or 
Player(2) characterizes the probability space $(\Omega, \mathcal{G}, P)$ (the local dynamics) to be
used next by the algorithm. In [40] the local dynamics of the algorithm are specified 
in the steps taken by the adversary. In the alternative model that we just outlined 
these local dynamics are given separately and define the algorithm. 


To summarize, the previous discussion illustrates the fact that a model for the anal- 
ysis of randomized algorithms requires the simultaneous modeling of an algorithm 
and of an adversary; and that various equivalent models can be derived by trading 
some properties between the model of an algorithm and the model of an adversary. 
Furthermore, in our discussion about the model of [40], we showed how we could 
define an algorithm by its local dynamics. We also saw that a limitation of this 
model is that it pre-supposes that Player(2) “knows” completely the state of the 
system. This does not fit situations such as the one encountered in Rabin’s algorithm for
mutual exclusion that we study in Chapter 3 of this work, where Player(2) has by 
assumption only a partial knowledge of the state of the system. 


⁶The model considered by Vardi in [58] does not have a set of actions but solely a state space: the
evolution of the system is defined by a concurrent Markov chain. We show here that our redefined 
notion of adversary coincides with the notion of scheduler in [58]. 

In [58], from a given state u, a step of the scheduler determines fully the next state v. This state 
determines in turn the probability distribution to be used for the next step. 

On the other hand, in our modified version of [40], the adversary determines the next action a. 
This, along with the previous state s determines uniquely the probability distribution to be used 
next. 
2.2.2 Motivations for Section 2.3


Recall that, if Player(1) selects the algorithm $\pi$ and if Player(2) selects the
adversary $\mathcal{A}$, the game played (i.e., the unfolding of the execution) consists of the
alternating actions of the algorithm and the adversary: Player(1) takes all the
actions as described by $\pi$ until the first point where some choice has to be resolved by
the adversary; Player(2) then takes actions to resolve this choice as described by $\mathcal{A}$,
and Player(1) resumes action once the choice has been resolved. This means that
the two players play sequentially, each move being randomized. Consider a player’s
random move and let $R$ denote the outcome of this random move: $R$ is a random
variable. Let $(\Omega, \mathcal{G})$ be the space where $R$ takes values and let $P$ be the probability
law $\mathcal{L}(R)$ of $R$.⁷


Recall also that our goal is not to construct a computational model that would 
describe the implementation of randomized algorithms, but, instead, to derive a 
probabilistic model allowing their analysis. In this perspective, two different imple- 
mentations of a player’s move leading to the same probabilistic output are indis- 
tinguishable: in probabilistic terms, random values having the same probabilistic 
law are indistinguishable. This means that the probability space $(\Omega, \mathcal{G}, P)$ gives all
the probabilistic information about the move of the player and we can for instance
assume that this move consists in drawing a random element of $\Omega$ with probability
$P$.⁸


By definition, a strategy of a player is a description of how the player is to take all 
its moves. Each move being described by a probability space $(\Omega, \mathcal{G}, P)$, a strategy
can be modeled as a family $(\Omega_x, \mathcal{G}_x, P_x)_{x \in X}$ of probability spaces. The set $X$ is the
set of different views of the system that the player can hold upon taking a move.
This set depends on the assumptions made about the information conveyed to the
player during the course of an execution, and about the memory of the past moves 
allowed to this player. 


To motivate this notion we briefly discuss the case of Rabin’s randomized algorithm
for mutual exclusion presented in Chapter 3, the case of Lehmann-Rabin’s
randomized dining-philosophers algorithm presented in Chapter 4 and the case of 
the randomized scheduling problem presented in Chapter 7. The reader might not 
be familiar with these problems at this point. Nevertheless, even in a first reading,


⁷ See Definition 8.1.2, page 198, for a definition of the law of a random variable.

⁸ There is a little subtlety here. “Drawing a random element $\omega$ of $\Omega$ with probability $P$” does
not imply that, for every $\omega \in \Omega$, the singleton $\{\omega\}$ is measurable, i.e., is a set in $\mathcal{G}$. (For instance,
in the case where $\emptyset$ is the only element of $\mathcal{G}$ having zero $P$-measure, if $B$ is an atom of $\mathcal{G}$ (see
Definition 8.1.1) and if $\omega$ is in $B$, then $\{\omega\} \in \mathcal{G}$ only if $B = \{\omega\}$.) Nevertheless this will be the case
in the sequel as we will assume that $\Omega$ is countable and that $\mathcal{G}$ is the discrete $\sigma$-field $\mathcal{P}(\Omega)$.
the following description provides a good intuition of the issues involved with the
notion of view. 


In Lehmann-Rabin’s algorithm [37], Player(2) knows at each time the full state 
of the system and remembers the past execution fragment. Its set X of views 
is therefore the set of execution fragments. (The notion of execution fragment is 
formally presented in Chapter 4.) By contrast, in Rabin’s randomized algorithm 
for mutual exclusion [49], Player(2) does not see the full state of the system, but, 
instead, only sees the value of the program counter $pc_i$ of each process $i$: in particular
Player(2) does not see the values of the various program variables.⁹ Here also it
remembers everything it learns. The set of possible views held by Player(2) upon
making a move is therefore the set of finite sequences $(i_1, pc_{i_1}, \ldots, i_k, pc_{i_k})$, where
$(i_1, \ldots, i_k)$ is a sequence of process-id’s.


Consider now the case of Player(1) (for the same algorithm [49]). After each of its 
moves, Player(1) only remembers the current state s of the program: s is determined 
by the values of the program counters and of the program variables. Before each of 
its moves it furthermore learns from the adversary which process $i$ is to take the next
step. Its set of views is therefore the set $I \times S$, where $I$ is the set of process-id’s and
$S$ is the state space of the system. (We can assume that the field $i$ of the view of
Player(1) is erased, i.e., reset to #, after each move of Player(1).)


In the scheduling problem of Chapter 7 the two players play the following game. 
At each (discrete) time $t$, Player(1) selects a set $s_t$ of n elements from $\{1, \ldots, n\}$.
Then Player(2) selects an element $f_t$ from $s_t$. We assume that Player(2) learns the
choices of Player(1) (and remembers everything). Its set of views is therefore the set
of finite sequences $s_1, f_1, s_2, f_2, \ldots$ By contrast, Player(1) learns none of the moves
of Player(2) and its set of views is the set of sequences $s_1, s_2, \ldots$


The intuition behind the notion of view should be clear: a player with a restricted
knowledge of the state of the system can sometimes hold the same view $x$ of two
different states. Being unable to distinguish between these two states, the player
therefore uses the same probability space $(\Omega_x, \mathcal{G}_x, P_x)$ to take its next move.
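
The principle just stated can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the toy states `(pc, var)`, the function `view`, and the move names are hypothetical and do not come from any of the algorithms discussed here.

```python
import random

# Hypothetical toy system: a state is a pair (pc, var), but the player's
# view shows only the program counter pc and hides the variable var.
def view(state):
    pc, _var = state
    return pc

# A strategy is a family of distributions indexed by views.
# Each distribution maps a move to its probability.
strategy = {
    "try": {"left": 0.5, "right": 0.5},
    "wait": {"stay": 1.0},
}

def draw_move(state, rng):
    """Draw the next move using only the view of the current state."""
    dist = strategy[view(state)]
    moves = list(dist)
    weights = [dist[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]

# Two distinct states with the same view select the same probability space.
same_space = strategy[view(("try", 0))] is strategy[view(("try", 7))]
```

Because `draw_move` consults the state only through `view`, the player cannot behave differently on two states it is unable to tell apart.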


In the sequel we distinguish between the views of the two players: we let X denote 
the set of views of Player(1) and Y the set of views of Player(2). 


The previous examples illustrate a general principle about the update mechanism 
of the views held by the two players: when taking a move a player changes the 


⁹ As shown in the quote presented on page 18, in [49] Rabin actually defines a schedule to be
a function on the set of finite runs. Equivalently, the assumption of [49] is that Player(2) knows
the past run of every execution. Nevertheless, a careful analysis of the algorithm shows that this is
equivalent to having Player(2) know the values of the program counters of the processes.
state of the system. Both players in general only learn partially the nature of the
change and update their views as prescribed by the specification of the problem. 
Our model will introduce two functions f and g formalizing how the move of either 
player modifies the state of the system and the views held by both players. Let us 
emphasize that f and g formalize the acquisition of all information by either player. 
This applies in particular to the situation considered by Graham and Yao in [25] 
where Player(2) learns the strategy $\pi$ selected by Player(1).


The previous discussion allows us to present in the next section a general model for 
randomized computing. 


2.3 A General Model for Randomized Computing


2.3.1 The model 


We argued in Chapter 1 and in Section 2.2 that a formal model for randomized 
computing should model simultaneously the notion of algorithm and of adversary 
and should allow for the consideration of several algorithms; that the notion of 
adversary should be modeled independently of the specific algorithm chosen; that 
the adversary should be randomized (i.e., allowed to use randomization). 


We argued also that the situation encountered while modeling algorithms and adver- 
saries was best described in terms of game theory: Player(1) is the entity selecting 
an (admissible) algorithm. Player(2) is the entity selecting an (admissible) adver- 
sary. An algorithm is a strategy of Player(1), whereas an adversary is a strategy of 
Player(2). These two players take steps in turn, following their respective strategies. 


Using this language of game theory, we argued that a precise mechanism should be 
introduced, describing how the state of the system evolves and how the views of the 
two players are affected when one of the two players takes a move. 


This discussion leads to the following definition. 


Definition 2.3.1 An algorithm/adversary structure for randomized computing is a 
tuple 
$(S, X, Y, y_{init}, f, g, \Pi, \mathbf{A})$


having the following properties: 


• $S$ is a set called the set of states.

• $X$ is a set called the set of views of Player(1). By assumption $\bot$ is an element
not in $X$.

• $Y$ is a set called the set of views of Player(2).

• $y_{init}$ is the initial view of Player(2). By assumption $y_{init}$ is not in $Y$.

• $\Pi = \{\pi_i\}_{i \in I}$ is a family of elements called algorithms. Each algorithm $\pi_i$
is itself a family of probability spaces, one for each view $x$ in $X$:
$\pi_i = (\Omega_{x,i}, \mathcal{G}_{x,i}, P_{x,i})_{x \in X}$.

• $\mathbf{A} = \{\mathcal{A}_j\}_{j \in J}$ is a family of elements called adversaries. Each adversary $\mathcal{A}_j$
is itself a family of probability spaces, one for each view $y$ in $Y \cup \{y_{init}\}$:
$\mathcal{A}_j = (\Omega_{y,j}, \mathcal{G}_{y,j}, P_{y,j})_{y \in Y \cup \{y_{init}\}}$. By assumption, $\bot$ is an element of $\Omega_{y,j}$ for
all $y \in Y$ and $j \in J$.

• $f : S \times X \times Y \times (\bigcup_{x \in X, i \in I} \Omega_{x,i}) \to S \times X \times Y$ is the update function associated
to Player(1).

• $g : S \times X \times (Y \cup \{y_{init}\}) \times (\bigcup_{y \in Y, j \in J} \Omega_{y,j}) \to S \times X \times Y$ is the update function
associated to Player(2). The function $g$ is such that, for every $s$ and $s'$ in $S$, $x$
and $x'$ in $X$, and $a$ in $\bigcup_{x \in X, i \in I} \Omega_{x,i}$, we have $g(s, x, y_{init}, a) = g(s', x', y_{init}, a)$.
For every $s$, $x$ and $y$, $g(s, x, y, \bot) = (\bot, \bot, \bot)$.


Abusing language, we will sometimes find it convenient to refer to a
$(S, X, Y, y_{init}, f, g, \Pi, \mathbf{A})$ structure as simply a $\Pi/\mathbf{A}$-structure. This is very similar to the abuse
of language committed by probabilists when speaking of a probability $P$ without
mentioning the underlying $\sigma$-field.
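
As a reading aid, some of the conditions of Definition 2.3.1 can be checked mechanically on a small finite instance. The sketch below is an assumption-laden illustration: the names `check_structure`, `BOTTOM`, `Y_INIT`, and the coin-flipping instance are hypothetical, probability spaces are represented as finite dictionaries with their discrete $\sigma$-field, and the update function $f$ is left out for brevity.

```python
from fractions import Fraction

BOTTOM = "bottom"   # stands for the symbol ⊥ of Definition 2.3.1
Y_INIT = "y_init"   # initial view of Player(2)

def is_distribution(dist):
    """A finite discrete probability space, given as a dict move -> probability."""
    return all(p >= 0 for p in dist.values()) and sum(dist.values()) == 1

def check_structure(S, X, Y, g, algorithms, adversaries):
    """Check some conditions of Definition 2.3.1 on a finite toy structure."""
    assert BOTTOM not in X and Y_INIT not in Y
    for pi in algorithms.values():           # one space per view x in X
        assert set(pi) == set(X)
        assert all(is_distribution(d) for d in pi.values())
    for adv in adversaries.values():         # one space per y in Y ∪ {y_init}
        assert set(adv) == set(Y) | {Y_INIT}
        assert all(is_distribution(d) for d in adv.values())
        assert all(BOTTOM in adv[y] for y in Y)   # ⊥ is always available
        for a in adv[Y_INIT]:                # the first move ignores (s, x)
            assert len({g(s, x, Y_INIT, a) for s in S for x in X}) == 1
    for s in S:
        for x in X:
            for y in Y:
                assert g(s, x, y, BOTTOM) == (BOTTOM, BOTTOM, BOTTOM)
    return True

# Hypothetical instance: Player(1) flips a fair coin, Player(2) schedules steps.
S, X, Y = {0, 1}, {"go"}, {"seen"}

def g(s, x, y, a):
    if a == BOTTOM:
        return (BOTTOM, BOTTOM, BOTTOM)
    if y == Y_INIT:
        return (0, "go", "seen")
    return (s, x, y)

algorithms = {"coin": {"go": {"H": Fraction(1, 2), "T": Fraction(1, 2)}}}
adversaries = {"sched": {Y_INIT: {"step": Fraction(1)},
                         "seen": {"step": Fraction(1), BOTTOM: Fraction(0)}}}
ok = check_structure(S, X, Y, g, algorithms, adversaries)
```

Note that $\bot$ must belong to each sample space $\Omega_{y,j}$ but may carry probability zero, as in the instance above.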


We now provide further justifications and reiterate some comments to the previous 
definition. 


We will describe in Section 2.4 how an algorithm/adversary structure for randomized
computing defines a set of executions. In doing so we will assume that Player(2)
takes the first move and that its initial view is an element
$y_{init}$ not in $Y$: we can assume this without loss of generality because we can always
add some dummy moves at the beginning of the game if necessary. The condition
$g(s, x, y_{init}, a) = g(s', x', y_{init}, a)$ for every $s$ and $s'$ in $S$, $x$ and $x'$ in $X$, and $a$ in
$\bigcup_{x \in X, i \in I} \Omega_{x,i}$, ensures that the first move of Player(2) is independent of the values
the state $s$ and the view $x$ might originally hold. For convenience we will let $g(y_{init}, a)$ denote
the value common to all $g(s, x, y_{init}, a)$.


Let us emphasize that, even though seemingly restrictive, our choice of initial
conditions is very general. In particular our model allows us to express that “a randomized
algorithm must behave correctly for a family Init of different initial states”. (This is
in particular the case of Lehmann-Rabin’s algorithm which we analyze in Chapter 4.)
Indeed we model such a situation by requiring that the first move of Player(2)
consists in choosing one of the states s in Init. The subsequent moves then correspond
to the “normal” exchanges between algorithm and adversary for the initial state
s selected by Player(2). Hence, in such a model, the fact that an algorithm “performs
well” against all adversaries encompasses that it does so for all initial inputs. This
example illustrates the power of the game theory setting that we adopt.


As discussed in Section 2.2, the purpose we seek in a model for randomized computing
is to analyze randomized algorithms and not to describe them. As a consequence,
even though a move of a player could be implemented in a variety of ways and involve
several sequential instructions, we are only interested in the probability distribution
of its output. This explains why we can assimilate a move to a probability space
$(\Omega, \mathcal{G}, P)$ and assume that, in order to take this move, the player draws a random
element of $\Omega$ with probability $P$.


A randomized algorithm is a strategy of Player(1), i.e., a description of how Player(1)
is to take all its moves. Each move being described by a probability space $(\Omega, \mathcal{G}, P)$,
a strategy $\pi_i$ can be modeled as a family $(\Omega_{x,i}, \mathcal{G}_{x,i}, P_{x,i})_{x \in X}$ of probability spaces.
$X$ is the set of views that Player(1) can have of the system: Player(1) can act
differently in two moves only if its views are different. This explains why a strategy
$\pi_i$ is associated to a different probability space for each different view $x$. This
justifies the definition of an algorithm given in Definition 2.3.1.


As discussed in Section 2.2, an adversary should similarly be randomized and, in 
general, similarly allowed to have only a partial view of the state of the system. 
This justifies the definition of an adversary $\mathcal{A}_j$ as a family of probability spaces
$(\Omega_{y,j}, \mathcal{G}_{y,j}, P_{y,j})_{y \in Y}$. The randomization of the adversary will be used crucially
in Chapter 7. The restricted view of the adversary is a crucial feature of Rabin’s 
algorithm for mutual exclusion studied in Chapter 3. 


As mentioned on page 32, the symbol $\bot$ is meant to signify that Player(2)
delays forever taking its next move and that the execution stops. We check that
our formalism is accurate. Assume that Player(2) selects $\bot$. By assumption
$g(s, x, y, \bot) = (\bot, \bot, \bot)$ so that the view of Player(1) is overwritten with $\bot$. As $\bot$
is not in $X$, Player(1) does not take any move and the execution stops.


For emphasis, we sometimes call $\mathbf{A}$ the set of admissible adversaries. Similarly $\Pi$ is
the set of admissible algorithms. In the case where we analyze a single algorithm $\pi$
we have $\Pi = \{\pi\}$. This is the case of Chapter 3 and Chapter 4 of this work. On
the other hand, in Chapter 7, we will prove the optimality of an algorithm within
an infinite class $\Pi$ of algorithms.


The function $f$ (resp. $g$) is the update function expressing how the state and the
views of the two players evolve when Player(1) (resp. Player(2)) takes a move: if
the state is $s$, if the views of the two players are $x$ and $y$, and if a move $a$ in $\Omega_{x,i}$ is
selected by an algorithm $\pi_i$, then the new state is $s'$ and the views of the two players
are changed into $x'$ and $y'$, where $s'$, $x'$ and $y'$ are defined by $f(s, x, y, a) = (s', x', y')$.
Similarly, if the state is $s'$, if the views of the two players are $x'$ and $y'$, and if a
move $a'$ in $\Omega_{y',j}$ is selected by an adversary $\mathcal{A}_j$, then the new state is $s$ and the
views of the two players are changed into $x$ and $y$, where $s$, $x$ and $y$ are defined by
$g(s', x', y', a') = (s, x, y)$.

Note that we required the function $f$ to be defined on the cartesian product
$S \times X \times Y \times (\bigcup_{x \in X, i \in I} \Omega_{x,i})$ only for simplicity of the exposition. We could for instance reduce
its domain of definition to the subset $\{(s, x, y, a); s \in S, x \in X, y \in Y, a \in \bigcup_{i \in I} \Omega_{x,i}\}$.
The domain of definition of $g$ could similarly be reduced. As we will see in the
next section, these variations do not affect the probabilistic structure on the set
of executions derived from the model. Also, we could easily generalize our model
to the case where, for every $(s, x, y)$, a move $a$ would lead to a randomized new
configuration $(s', x', y')$. This situation would correspond to an environment with
randomized dynamics. For simplicity we consider here only situations where the
environment has deterministic dynamics.


The model of Definition 2.3.1 expands on the idea presented on page 33 and defines
algorithms and adversaries by providing their local dynamics. Indeed each probability
space $(\Omega_{x,i}, \mathcal{G}_{x,i}, P_{x,i})$ describes how the algorithm $\pi_i$ selects its next move when
its view is $x$, and the function $f$ describes what the state and the views evolve into
after each such move. The adversary is defined through symmetric structures.


2.3.2 Special cases 


We now mention two special cases which will play an important role in the sequel. 


The analysis of a single algorithm $\pi_0$ corresponds to the case where $\Pi$ is equal to
the singleton $\{\pi_0\}$: Player(1) has only one strategy.


A second important case is when the strategies of Player(2) are all deterministic.
This corresponds to the situation where every admissible adversary
$\mathcal{A}_j = (\Omega_{y,j}, \mathcal{G}_{y,j}, P_{y,j})_{y}$ is such that all the measures $P_{y,j}$ are Dirac measures. In the case
where the $\sigma$-fields $\mathcal{G}_{y,j}$ are discrete this means that for all $j$ and $y$, there exists some
point $\omega_{y,j}$ in $\Omega_{y,j}$ such that $P_{y,j}[\omega_{y,j}] = 1$.
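
In the discrete case this condition is easy to test. The following sketch assumes, purely for illustration, that an adversary is stored as a dictionary mapping each view to a finite distribution; the names `is_dirac` and `is_deterministic` and the two sample adversaries are hypothetical.

```python
def is_dirac(dist):
    """True if the discrete distribution puts probability one on a single point."""
    values = list(dist.values())
    return (all(p >= 0 for p in values)
            and sum(values) == 1
            and sum(1 for p in values if p == 1) == 1)

def is_deterministic(adversary):
    """An adversary (view -> distribution) is deterministic if each measure is Dirac."""
    return all(is_dirac(dist) for dist in adversary.values())

# Hypothetical adversaries over two views.
deterministic = {"y_init": {"step": 1, "halt": 0}, "y1": {"halt": 1}}
randomized = {"y_init": {"step": 0.5, "halt": 0.5}, "y1": {"halt": 1}}
```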
As mentioned in Chapter 1, randomization does not make the adversary more powerful
and we can without loss of generality assume that all admissible adversaries
are deterministic when a single algorithm is considered (or, more generally, when
$\Pi$ is finite). This fact is widely acknowledged and explains why only deterministic
adversaries are considered in the literature devoted to the analysis of a given
algorithm. (Hart et al. [29], page 359, and Aspnes-Waarts [4], Section 5, mention
that fact explicitly. See also Section 4 of [58], devoted to probabilistic concurrent
programs, where Vardi models schedulers as deterministic entities.)
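
The intuition behind this fact can be illustrated on a single move. The sketch below is not a proof of the general statement: it assumes a hypothetical one-shot situation in which the failure probability of the algorithm is known for each deterministic choice of the adversary, so that a randomized choice yields a weighted average of these values and therefore cannot exceed the largest of them. All numbers and names are invented for the example.

```python
from fractions import Fraction

# Hypothetical numbers: probability that the algorithm fails when the
# adversary deterministically plays each of its available moves.
failure_if_move = {"a": Fraction(1, 4), "b": Fraction(2, 5), "c": Fraction(1, 10)}

def failure_if_randomized(weights):
    """Failure probability when the adversary draws its move from `weights`."""
    assert sum(weights.values()) == 1
    return sum(weights[m] * failure_if_move[m] for m in weights)

best_deterministic = max(failure_if_move.values())
mixed = failure_if_randomized(
    {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
)
```

Here `mixed` is a convex combination of the three deterministic values, so it is at most `best_deterministic`.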


We are now ready to address the question raised in Section 2.2 and characterize 
the global probability space(s) whose sample space is the set of executions and whose
probability distribution is induced by the local dynamics of the two players. 


2.4 The Probability Schema associated to a $\Pi/\mathbf{A}$-Structure


Consider an algorithm/adversary structure (in short, a $\Pi/\mathbf{A}$-structure)
$(S, X, Y, y_{init}, f, g, \Pi, \mathbf{A})$ as presented in Definition 2.3.1.


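The purpose of this section is to define for each algorithm $\pi$ in $\Pi$ and each adversary
$\mathcal{A}$ in $\mathbf{A}$ a probability space $(\Omega_{\pi,\mathcal{A}}, \mathcal{G}_{\pi,\mathcal{A}}, P_{\pi,\mathcal{A}})$ whose sample space $\Omega_{\pi,\mathcal{A}}$ is the set of
executions generated by $\pi$ and $\mathcal{A}$ and whose probability measure $P_{\pi,\mathcal{A}}$ is “induced”
by the local dynamics of $\pi$ and $\mathcal{A}$.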


We first need to define the sample space $\Omega_{\pi,\mathcal{A}}$ and the $\sigma$-field $\mathcal{G}_{\pi,\mathcal{A}}$. The precise
sample space $\Omega_{\pi,\mathcal{A}}$ we have in mind is the set of “maximal” executions, a maximal
execution being either an infinite execution or a finite execution such that the last
move of Player(2) is $\bot$.


The $\sigma$-field $\mathcal{G}_{\pi,\mathcal{A}}$ we have in mind is the smallest one allowing one to measure (in a
probabilistic sense) all the sets of executions “defined by finitely many conditions”.


In order to carry out our construction we need to make the following assumption. 


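Assumption: The probability spaces $(\Omega_{x,i}, \mathcal{G}_{x,i}, P_{x,i})$ and $(\Omega_{y,j}, \mathcal{G}_{y,j}, P_{y,j})$ defining
the algorithms in $\Pi$ and the adversaries in $\mathbf{A}$ all have countable sample spaces. The
associated $\sigma$-fields $\mathcal{G}_{x,i}$ and $\mathcal{G}_{y,j}$ are the discrete $\sigma$-fields $\mathcal{P}(\Omega_{x,i})$ and $\mathcal{P}(\Omega_{y,j})$.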


(Let us mention that the requirement that all the $\sigma$-fields are discrete is not restrictive:
if this were not the case we would adopt an equivalent probabilistic model by
changing all the sample spaces and taking only one point in every atom.¹⁰ Also, we
could relax the hypothesis of countability of the sample spaces by requiring instead
that all the probability measures $P_{x,i}$ and $P_{y,j}$ have countable support.)


Let $\pi_i = (\Omega_{x,i}, \mathcal{G}_{x,i}, P_{x,i})_{x \in X}$ and $\mathcal{A}_j = (\Omega_{y,j}, \mathcal{G}_{y,j}, P_{y,j})_{y \in Y}$ be two elements in $\Pi$ and
$\mathbf{A}$ respectively. A $(\pi_i, \mathcal{A}_j)$-execution $\omega$ is a (finite or infinite) sequence of alternating
actions $a$ and triples $(s, x, y)$ of states and views ending, if the execution is finite,
with the state-view triplet $(\bot, \bot, \bot)$,

$$\omega = \underbrace{a_1\,(s_1, x_1, y_1)}_{\text{Player(2)}}\ \underbrace{a_2\,(s_2, x_2, y_2)}_{\text{Player(1)}}\ \underbrace{a_3\,(s_3, x_3, y_3)}_{\text{Player(2)}} \ldots$$

and having the following properties:


1. $a_1 \in \Omega_{y_{init},j}$ and $(s_1, x_1, y_1) = g(y_{init}, a_1)$.¹¹


2. for every even $k$, $k \ge 1$, $a_{k+1} \in \Omega_{y_k,j}$ and $(s_{k+1}, x_{k+1}, y_{k+1}) = g(s_k, x_k, y_k, a_{k+1})$.


3. for every odd $k$, $k \ge 1$, $a_{k+1} \in \Omega_{x_k,i}$ and $(s_{k+1}, x_{k+1}, y_{k+1}) = f(s_k, x_k, y_k, a_{k+1})$.


A $(\pi_i, \mathcal{A}_j)$-execution-fragment is a finite execution-prefix terminating with a
state-view triplet. For every $(\pi_i, \mathcal{A}_j)$-execution-fragment $\omega$, we define the length $|\omega|$
of $\omega$ to be the number of actions $a$ present in $\omega$. For instance the length of
$a_1 (s_1, x_1, y_1)\, a_2 (s_2, x_2, y_2)$ is 2. We define:

$$\Omega_{\pi_i,\mathcal{A}_j} = \{\omega ; \omega \text{ is a } (\pi_i, \mathcal{A}_j)\text{-execution}\}.$$
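
The alternation described by properties 1 to 3 can be sketched as a small sampler. The code below is a minimal illustration under assumptions of our own: distributions are finite dictionaries, the function name `sample_fragment` is hypothetical, and the toy structure (a state counting heads, an adversary that halts after two heads) is invented for the example. The sampler returns either a finite execution ending with $(\bot, \bot, \bot)$ or an execution-fragment truncated after `max_actions` actions.

```python
import random

BOTTOM = "bottom"   # stands for the symbol ⊥
Y_INIT = "y_init"   # initial view of Player(2)

def draw(dist, rng):
    moves = list(dist)
    return rng.choices(moves, weights=[dist[m] for m in moves], k=1)[0]

def sample_fragment(pi, adv, f, g, max_actions, rng):
    """Return [a1, (s1, x1, y1), a2, (s2, x2, y2), ...]."""
    a = draw(adv[Y_INIT], rng)               # property 1: Player(2) moves first
    config = g(None, None, Y_INIT, a)
    omega = [a, config]
    k = 1
    while k < max_actions and config != (BOTTOM, BOTTOM, BOTTOM):
        s, x, y = config
        if k % 2 == 1:                       # property 3: odd k, Player(1) moves
            a = draw(pi[x], rng)
            config = f(s, x, y, a)
        else:                                # property 2: even k, Player(2) moves
            a = draw(adv[y], rng)
            config = g(s, x, y, a)
        omega += [a, config]
        k += 1
    return omega

# Hypothetical toy structure: the state counts heads; Player(2) sees the count
# and stops the execution (move ⊥) once two heads have been flipped.
def f(s, x, y, a):
    heads = s + (a == "H")
    return (heads, x, heads)

def g(s, x, y, a):
    if a == BOTTOM:
        return (BOTTOM, BOTTOM, BOTTOM)
    if y == Y_INIT:
        return (0, "go", 0)
    return (s, x, y)

pi = {"go": {"H": 0.5, "T": 0.5}}
adv = {Y_INIT: {"step": 1.0}, 0: {"step": 1.0, BOTTOM: 0.0},
       1: {"step": 1.0, BOTTOM: 0.0}, 2: {BOTTOM: 1.0}}
omega = sample_fragment(pi, adv, f, g, max_actions=50, rng=random.Random(0))
```

The length of the sampled fragment, in the sense defined above, is `len(omega) // 2`.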


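We now turn to the formal definition of $\mathcal{G}_{\pi_i,\mathcal{A}_j}$. We endow the sets $S$, $X$ and $Y$
with the discrete $\sigma$-fields $\mathcal{P}(S)$, $\mathcal{P}(X)$ and $\mathcal{P}(Y)$. We define on $\Omega_{\pi_i,\mathcal{A}_j}$ the functions
$A_k, S_k, X_k, Y_k$; $k \ge 1$, by setting $A_k(\omega) = a_k$, $S_k(\omega) = s_k$, $X_k(\omega) = x_k$ and $Y_k(\omega) = y_k$
for every $\omega = a_1 (s_1, x_1, y_1)\, a_2 (s_2, x_2, y_2) \ldots$ of length at least $k$. We extend the
definition and set $A_k(\omega) = S_k(\omega) = X_k(\omega) = Y_k(\omega) = \bot$ if $|\omega| < k$.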


We define $\mathcal{G}_{\pi_i,\mathcal{A}_j}$ to be the smallest $\sigma$-field making all the functions $A_k, S_k, X_k, Y_k$; $k \ge 1$,
measurable:

$$\mathcal{G}_{\pi_i,\mathcal{A}_j} = \sigma(A_k, S_k, X_k, Y_k; k \ge 1).$$


Note that, as for every $k$ the triple $S_k, X_k, Y_k$ is defined deterministically in terms
of $A_k$, $S_{k-1}$, $X_{k-1}$ and $Y_{k-1}$, we also have that

$$\mathcal{G}_{\pi_i,\mathcal{A}_j} = \sigma(A_k; k \ge 1).$$


¹⁰ See Definition 8.1.1, page 198, for a definition of an atom.
¹¹ See page 37 for a discussion of the special case $y = y_{init}$.
For the same reason we could have more simply defined an execution $\omega$ to be the
sequence

$$\omega = a_1 a_2 a_3 \ldots$$

in place of the sequence

$$\omega = a_1 (s_1, x_1, y_1)\, a_2 (s_2, x_2, y_2)\, a_3 (s_3, x_3, y_3) \ldots$$

We will adopt this simplified definition in Chapter 7. Our slightly redundant
definition has the advantage of making explicit states and views which are important
parameters.


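Recall that, by assumption, all the $\sigma$-fields $\mathcal{G}_{x,i}$ and $\mathcal{G}_{y,j}$ are the discrete $\sigma$-fields
$\mathcal{P}(\Omega_{x,i})$ and $\mathcal{P}(\Omega_{y,j})$. This implies that for every $a$ in $\Omega_{x,i}$ or $\Omega_{y,j}$ the singleton
$\{a\}$ is in $\mathcal{G}_{x,i}$ or $\mathcal{G}_{y,j}$, respectively. Hence, for every sequence $(a_1, \ldots, a_k)$, the set
$\{A_1 = a_1, \ldots, A_k = a_k\}$ is measurable in $\mathcal{G}_{\pi_i,\mathcal{A}_j}$. This allows us to present the
following equivalent definition of $\mathcal{G}_{\pi_i,\mathcal{A}_j}$. For every $(\pi_i, \mathcal{A}_j)$-execution-fragment $\alpha$ we
define the rectangle $R_\alpha$ to be the set of $(\pi_i, \mathcal{A}_j)$-executions having $\alpha$ as their initial
prefix. Then

$$\mathcal{G}_{\pi_i,\mathcal{A}_j} = \sigma(R_\alpha; \alpha \text{ is a } (\pi_i, \mathcal{A}_j)\text{-execution-fragment}).$$

This formalizes the claim formulated at the beginning of this section that $\mathcal{G}_{\pi_i,\mathcal{A}_j}$ is
the $\sigma$-field allowing one to measure probabilistically all the sets of executions defined
“by finitely many conditions”.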


We now turn to the definition of the global probability measure $P_{\pi_i,\mathcal{A}_j}$. We want
this measure, if it exists, to be compatible with the local dynamics defining $\pi_i$
and $\mathcal{A}_j$. This means that, if $\alpha = a_1 (s_1, x_1, y_1)\, a_2 (s_2, x_2, y_2) \ldots a_k (s_k, x_k, y_k)$ is a
$(\pi_i, \mathcal{A}_j)$-execution-fragment, then

$$P_{\pi_i,\mathcal{A}_j}[R_\alpha] = P_{y_{init},j}[a_1]\, P_{x_1,i}[a_2]\, P_{y_2,j}[a_3] \cdots P_{x_{k-1},i}[a_k],$$

where, for the sake of exposition, we assumed that $k$ was even. We now define a
filtration $(\mathcal{G}_k)_{k \ge 1}$ of $(\Omega_{\pi_i,\mathcal{A}_j}, \mathcal{G}_{\pi_i,\mathcal{A}_j})$, i.e., an increasing sequence of sub-$\sigma$-fields of
$\mathcal{G}_{\pi_i,\mathcal{A}_j}$. For every $k$, $k \ge 1$, we set


$$\mathcal{G}_k = \sigma(R_\alpha; \alpha \text{ is a } (\pi_i, \mathcal{A}_j)\text{-execution-fragment}, |\alpha| \le k).$$


The $\sigma$-field $\mathcal{G}_k$ should be more appropriately called $\mathcal{G}_{k,i,j}$. We drop the indices $i$ and
$j$ for simplicity. Then, for every $k \ge 1$, setting

$$P_k[R_\alpha] = P_{y_{init},j}[a_1]\, P_{x_1,i}[a_2]\, P_{y_2,j}[a_3] \cdots P_{x_{k'-1},i}[a_{k'}]$$
for every $(\pi_i, \mathcal{A}_j)$-execution-fragment $\alpha$ of even length $k' \le k$, and similarly setting

$$P_k[R_\alpha] = P_{y_{init},j}[a_1]\, P_{x_1,i}[a_2]\, P_{y_2,j}[a_3] \cdots P_{y_{k'-1},j}[a_{k'}]$$

for every $(\pi_i, \mathcal{A}_j)$-execution-fragment $\alpha$ of odd length $k' \le k$, defines a probability
measure $P_k$ on $\mathcal{G}_k$. The probability measures $(P_k)_{k \ge 1}$ are compatible in the sense
that, if $k \le l$ are two integers then $P_k[R_\alpha] = P_l[R_\alpha]$ for every $(\pi_i, \mathcal{A}_j)$-execution-fragment
$\alpha$ of length at most $k$. Equivalently, the measures $P_k$ and $P_l$ coincide on
$\mathcal{G}_k$. Therefore, by Kolmogorov’s extension theorem (see for instance [56], pages 161-163),
there exists a (unique) probability measure $P$ defined on the $\sigma$-field $\sigma(\bigcup_k \mathcal{G}_k)$.
As $\sigma(\bigcup_k \mathcal{G}_k) = \mathcal{G}_{\pi_i,\mathcal{A}_j}$ we have therefore established the existence of a probability
measure $P$ defined on $(\Omega_{\pi_i,\mathcal{A}_j}, \mathcal{G}_{\pi_i,\mathcal{A}_j})$ and compatible with the local dynamics defining
$\pi_i$ and $\mathcal{A}_j$. This measure $P$ is the measure $P_{\pi_i,\mathcal{A}_j}$ we were after.
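
The product formula for the probability of a rectangle, and the compatibility it implies, can be checked numerically on a toy instance. The sketch below makes several assumptions for illustration only: the function `rectangle_probability` and the coin-flipping structure are hypothetical, a fragment is identified with its sequence of actions (which is legitimate since states and views are determined by the actions), and exact rational arithmetic is used.

```python
from fractions import Fraction

BOTTOM = "bottom"   # stands for the symbol ⊥
Y_INIT = "y_init"   # initial view of Player(2)
HALF = Fraction(1, 2)

# Hypothetical toy structure: the state counts heads and Player(2) sees the count.
def f(s, x, y, a):
    heads = s + (a == "H")
    return (heads, x, heads)

def g(s, x, y, a):
    if a == BOTTOM:
        return (BOTTOM, BOTTOM, BOTTOM)
    if y == Y_INIT:
        return (0, "go", 0)
    return (s, x, y)

pi = {"go": {"H": HALF, "T": HALF}}
adv = {Y_INIT: {"step": Fraction(1)},
       0: {"step": HALF, BOTTOM: HALF},
       1: {BOTTOM: Fraction(1)}}

def rectangle_probability(actions):
    """Probability of the rectangle of the fragment with the given actions."""
    prob, config = Fraction(1), None
    for m, a in enumerate(actions, start=1):
        if config == (BOTTOM, BOTTOM, BOTTOM):
            return Fraction(0)               # nothing can follow the move ⊥
        if m == 1:                           # first move: Player(2) at y_init
            prob *= adv[Y_INIT].get(a, 0)
            config = g(None, None, Y_INIT, a)
        elif m % 2 == 0:                     # even move: Player(1) at view x
            s, x, y = config
            prob *= pi[x].get(a, 0)
            config = f(s, x, y, a)
        else:                                # odd move: Player(2) at view y
            s, x, y = config
            prob *= adv[y].get(a, 0)
            config = g(s, x, y, a)
        if prob == 0:
            return prob
    return prob

alpha = ["step", "T"]
extensions = [alpha + [a] for a in ("step", BOTTOM)]
```

Summing `rectangle_probability` over the one-action extensions of `alpha` gives back the probability of `alpha` itself, which is the compatibility property used above to invoke Kolmogorov's extension theorem.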


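This completes the characterization of the global probability space
$(\Omega_{\pi_i,\mathcal{A}_j}, \mathcal{G}_{\pi_i,\mathcal{A}_j}, P_{\pi_i,\mathcal{A}_j})$ compatible
with the local dynamics attached to an algorithm $\pi_i$ in $\Pi$ and an adversary
$\mathcal{A}_j$ in $\mathbf{A}$ as defined in a $\Pi/\mathbf{A}$-structure $(S, X, Y, y_{init}, f, g, \Pi, \mathbf{A})$. The construction
required that the probability spaces $(\Omega_{x,i}, \mathcal{G}_{x,i}, P_{x,i})$ and $(\Omega_{y,j}, \mathcal{G}_{y,j}, P_{y,j})$ defining the
algorithm $\pi_i$ and the adversary $\mathcal{A}_j$ all have countable sample spaces.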


The analysis of a $\Pi/\mathbf{A}$-structure requires the simultaneous consideration of all the
probability spaces $(\Omega_{\pi_i,\mathcal{A}_j}, \mathcal{G}_{\pi_i,\mathcal{A}_j}, P_{\pi_i,\mathcal{A}_j})$. Correspondingly, the notion of event has
to be modified to take into account the various choices of strategy made by Player(1)
and Player(2). The next definition summarizes these facts.


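Definition 2.4.1 The family $(\Omega_{\pi,\mathcal{A}}, \mathcal{G}_{\pi,\mathcal{A}}, P_{\pi,\mathcal{A}})_{(\pi,\mathcal{A}) \in \Pi \times \mathbf{A}}$ is called the probability
schema attached to the algorithm/adversary structure $(S, X, Y, y_{init}, f, g, \Pi, \mathbf{A})$.
An event schema $B$ is a family $B = (B_{\pi,\mathcal{A}})_{(\pi,\mathcal{A}) \in \Pi \times \mathbf{A}}$, where $B_{\pi,\mathcal{A}} \in \mathcal{G}_{\pi,\mathcal{A}}$ for every
$(\pi, \mathcal{A}) \in \Pi \times \mathbf{A}$. A variable-schema $X$ is a family $X = (X_{\pi,\mathcal{A}})_{(\pi,\mathcal{A}) \in \Pi \times \mathbf{A}}$, where, for
every $(\pi, \mathcal{A}) \in \Pi \times \mathbf{A}$, $X_{\pi,\mathcal{A}}$ is a random variable measurable with respect to $\mathcal{G}_{\pi,\mathcal{A}}$.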


Traditionally Kolmogorov’s theorem is stated for $\mathbb{R}^\infty$ endowed with its Borel $\sigma$-field
$\mathcal{B}(\mathbb{R}^\infty)$. We therefore explain and justify here our use of Kolmogorov’s theorem in
the previous construction. This discussion is technical and not fundamental for the
rest of the work.


Recall that, by assumption, all the sample spaces $\Omega_{x,i}$ and $\Omega_{y,j}$ are countable and
endowed with their discrete $\sigma$-fields $\mathcal{P}(\Omega_{x,i})$ and $\mathcal{P}(\Omega_{y,j})$. By relabelling we can
therefore take all these spaces to be equal to $\mathbb{N}$ endowed with the discrete $\sigma$-field
$\mathcal{P}(\mathbb{N})$. Hence we can take $\Omega_{\pi_i,\mathcal{A}_j} = \mathbb{N} \times \mathbb{N} \times \ldots$ endowed with the product
$\sigma$-field $\mathcal{P}(\mathbb{N}) \otimes \mathcal{P}(\mathbb{N}) \otimes \ldots$ For each $k$, the $\sigma$-field $\mathcal{G}_k$ then corresponds to the set
$\{A \times \mathbb{N} \times \mathbb{N} \times \ldots; A \in \mathcal{P}(\mathbb{N}^k)\}$. For each $k$ the space $\mathbb{N}^k$ is a complete (a Cauchy
sequence in $\mathbb{N}^k$ is constant after a certain rank and hence obviously converging)
separable metric space (for the topology induced from $\mathbb{R}^k$). As every point in $\mathbb{N}^k$
is open, the $\sigma$-field $\mathcal{P}(\mathbb{N}^k)$ is trivially generated by the open sets. Therefore the
extension of Kolmogorov’s theorem given in [56], page 163, applies.


Chapter 3 


Rabin’s Algorithm for Mutual 
Exclusion 


3.1 Introduction 


In this chapter we study the correctness of Rabin’s randomized distributed algo- 
rithm [49] implementing mutual exclusion for n processes using a read-modify-write 
primitive on a shared variable with O(log n) values. As we are concerned with a sin- 
gle algorithm, the remarks made in Section 2.3.2 of Chapter 2 allow us to consider 
only deterministic adversaries throughout the analysis. Rabin’s algorithm differs
markedly from most other work in randomized computing in the following three
ways.


1. The correctness statement is expressed in terms of a property holding with
“high” probability for all adversaries. Formally, such a statement is of the
form

$$\inf_{\mathcal{A} \in \mathbf{A}'} P_{\mathcal{A}}[W_{\mathcal{A}} \mid I_{\mathcal{A}}] \ge \alpha,$$

where $W$ and $I$ are event schemas¹ and $\mathbf{A}'$ is a subset of the set of admissible
adversaries $\mathbf{A}$. Nevertheless, in contrast with much previous work, “high”
does not mean here “with probability one”, i.e., $\alpha$ is not equal to one.


2. This correctness property is relative to the short term behavior of the algo- 
rithm and depends critically on the specific probability distribution of the 
random inputs.


¹ See Definition 2.4.1, page 43.
3. The adversary is assumed to have only a partial knowledge of the past
execution.
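
The form of the statement in property 1 can be made concrete on a finite example. The sketch below is a hedged illustration only: the probability spaces, the events, and the adversary names `A1` and `A2` are invented, and the decision to skip adversaries for which the conditioning event has probability zero is an assumption of this sketch rather than a convention of the text.

```python
from fractions import Fraction

def conditional(prob, W, I):
    """P[W | I] on a finite probability space given as outcome -> probability."""
    p_I = sum(p for w, p in prob.items() if w in I)
    if p_I == 0:
        return None            # undefined: the conditioning event has probability zero
    return sum(p for w, p in prob.items() if w in W and w in I) / p_I

# Hypothetical schema: one finite probability space and one pair of events
# (W, I) per admissible adversary.
schema = {
    "A1": ({"u": Fraction(1, 2), "v": Fraction(1, 4), "w": Fraction(1, 4)},
           {"u"}, {"u", "v"}),
    "A2": ({"u": Fraction(1, 3), "v": Fraction(2, 3)},
           {"u", "v"}, {"v"}),
}

def worst_case(schema):
    """Smallest conditional probability over the adversaries where it is defined."""
    values = [conditional(prob, W, I) for (prob, W, I) in schema.values()]
    return min(v for v in values if v is not None)

bound = worst_case(schema)
```

A correctness statement of the form above then amounts to checking that `bound` is at least the target value $\alpha$.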


We comment here on properties 1-3. Properties 1 and 2 are related. Property 3 is 
of another nature. 


As discussed in Chapter 1, most correctness statements for randomized algorithms 
are either expressed in terms of a high expected performance or in terms of a property
holding with probability one. For instance the algorithms presented in [3] and [4]
are shown to have a low worst case expected running time.² On the other hand, the
original property claimed in [37] for Lehmann-Rabin’s algorithm is that “progress 
occurs with probability one”. 


A reason for considering probability-one statements is that they arise naturally from
considerations of ergodic theory and zero-one laws. In particular, a feature of these
statements is that the events whose probability is claimed to be one belong to the
tail $\sigma$-field of the executions, i.e., depend (only) on the asymptotic properties of the
executions. Also, these properties usually do not depend on the precise numeric
probability values associated to each random input $X_n$. Instead, they typically
depend only on the asymptotic properties of the sequence $(X_n)_n$.³


Such a setting has been exploited by authors interested in the logic and semantics
of randomized computing. (See for instance [19, 28, 29, 47, 58].) The methods
presented in these papers are directed at randomized algorithms whose correctness
reflects the asymptotic properties of infinite executions, depends only marginally on
the numeric specifications of the random inputs, and is expressed as a
probability-one statement.


These methods therefore do not apply to algorithms such as Rabin’s algorithm whose
correctness, as we will see, is relative to the short term behavior of the algorithm,
depends critically on the specific probability distribution(s) of the random inputs
and is expressed by a nontrivial high-probability statement.


² The term “worst case” refers to the adversary. The precise measure used in [4] is the worst
case expected running time per processor.

³ The Borel-Cantelli Lemma provides a good example of a classical result of probability theory
which depends only on the tail of the sequence of events considered. This property states that, if
$(A_n)_{n \in \mathbb{N}}$ is a sequence of independent events, then infinitely many events $A_n$ occur with probability
one if and only if the sum $\sum_n P(A_n)$ is infinite. Assume for instance that $A_n = \{X_n = H\}$ where
$(X_n)_{n \in \mathbb{N}}$ is a sequence of flips made with independent coins. (The coins are possibly all different.)
Then Head appears infinitely often with probability one if and only if $\sum_n P[X_n = H] = \infty$. This depends only on
the asymptotic property of the coins and depends on the bias of the coins only through the sum
$\sum_n P[X_n = H]$.
These difficulties significantly complicate the analysis of any randomized algorithm.
But one of the most significant challenges encountered in the proof of correctness of
Rabin’s algorithm [49] is to account formally for the limited knowledge granted to
the adversary. As mentioned in Chapter 1, to our knowledge, our model of Chapter 2
is the first to present a general and formal framework allowing for adversaries with
limited knowledge. In the absence of such a framework, the analyses of nontrivial
algorithms such as the one presented in [49] have been fraught with mistakes in two rather
incomparable domains.


Mistakes can happen of course in the proof of the property claimed. Such mistakes 
are not uncommon, as it is very hard to formally disentangle in a proof the combined 
effects of randomness and of non-determinism. In particular, recall from Chapter 2
that the formal analysis of a randomized algorithm requires working within a class
of different probability spaces $(\Omega_{\mathcal{A}}, \mathcal{G}_{\mathcal{A}}, P_{\mathcal{A}})$, one for each admissible adversary $\mathcal{A}$.
A proof is therefore not truly formal unless it explicitly records the presence of the
adversary $\mathcal{A}$ in the probabilistic expressions involved.


But the complications in the analysis of a randomized algorithm sometimes begin 
with the formalization of the intended correctness property, i.e., even before the 
proof itself begins. Indeed, as we will see in Section 3.2, even a simple statement 
including a precondition can lead to very delicate problems of formalization. Using 
an image one might describe this problem as a Solomon dilemma trying to ascertain 
“which of the two players should bear the responsibility of the precondition”. 


The example of Rabin’s randomized algorithm for mutual exclusion illustrates perfectly
the two types of difficulties and the dangers encountered with arguments not
fully formalized. In [49], Rabin claimed that the algorithm satisfies the following
correctness property: for every adversary, any process competing for entrance to the
critical section succeeds with probability $\Omega(1/m)$, where $m$ is the number of competing
processes. In [29], Sharir et al. gave another discussion of the algorithm as an
illustration of their formal Markov chains model and argued about its correctness.


However, neither paper stated formally the correctness property claimed, and
neither made explicit in its arguments the influence of the adversary on the
probability distribution on executions.


We show in this chapter that the correctness of the algorithm is tainted for the
two different reasons described above. We first show that the informal correctness
statement claimed by Rabin admits no natural “adequate” formalization.⁴ We then
show that the influence of the adversary is much stronger than previously thought,
and in fact, the high probability correctness result claimed in [49] does not hold.


⁴ The notion of “adequate” formalization is discussed in Section 3.2.
3.2. Formalizing a High-Probability Correctness Statement


As mentioned above, the complications in the analysis of a randomized algorithm
sometimes begin with the formalization of the intended correctness property, i.e., even
before the proof itself begins. Consider for instance a correctness property described
informally in the following general format: “for all adversaries, if property B holds
then property C holds with probability at least 1/2”. (The probability 1/2 is chosen
only for the sake of illustration.) How can we formalize such a statement? In the
sequel we will refer to C as the target property and to B as the precondition.


Again, as discussed in Chapter 2, we know that the property $B$ is actually formalized
to be a family of events $(B_A)_{A \in \mathcal{A}}$ (each $B_A$ is an element of the $\sigma$-field $\mathcal{G}_A$),
and that, similarly, $C$ is a family of events $(C_A)_{A \in \mathcal{A}}$. Also, the probability referred to
in the informal correctness statement depends on $A$ and corresponds to different
probability distributions $P_A$. In spite of this comment we will conform to the tradition,
and write for simplicity $B$ and $C$ instead of $B_A$ and $C_A$ and emphasize only
when necessary the dependence on $A$. (The dependence on $A$ will nevertheless be
everywhere implicit.) On the other hand, we will find it useful to emphasize the
dependence of the probability $P_A$ on the adversary $A$. This being clarified, how do
we formalize the “if B then C” clause of the statement?


Following well-anchored reflexes in probability theory we are naturally led to translate
this statement into a conditional probability and say that, for every $A$, we
compute the probability of the event $C_A$ conditioned on $B_A$. Indeed, conditioning
on an event $B$ exactly formalizes that we restrict the probabilistic analysis to
within the set $B$. (The elementary definition of conditioning in terms of Bayes’ rule
expresses that we restrict our attention to within $B$ and consider the trace $C \cap B$
of $C$ in $B$. The denominator $P[B]$ is there to normalize the restricted measure
$C \mapsto P[C \cap B]$ into a probability measure.⁵) We can then formalize the previous
informal correctness statement into

$$\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B] \ \ge\ 1/2.$$


⁵ A more general definition of conditioning is as follows. In that case the notion of “restricting
the analysis” is formalized as an orthogonal projection in an adequate setting.

Let $\mathcal{G}$ and $\mathcal{G}'$, with $\mathcal{G}' \subseteq \mathcal{G}$, be two $\sigma$-fields, and let $(\Omega, \mathcal{G}, P)$ be a probability space. For every function $f$
in $L^2(dP)$, we define the conditional expectation $E[f \mid \mathcal{G}']$ to be the orthogonal projection (in the
$L^2$-sense) of the $\mathcal{G}$-measurable function $f$ onto the subspace of $L^2(dP)$ composed of $\mathcal{G}'$-measurable
functions. We recover the elementary definition in the case where $f$ is the indicator function of a
set $C$ and $\mathcal{G}' = \{\emptyset, B, \overline{B}, \Omega\}$: in that case, the orthogonal projection of $f$ takes on $B$ the
constant value $P[C \cap B]/P[B]$, i.e., the elementary conditional probability $P[C \mid B]$.
The restriction $P_A[B] > 0$ expresses that we consider only adversaries $A$ for which
the precondition $B$ is of probabilistic relevance.


An example. 

We now show that, in some (frequent) cases, the dependence on A of the correct- 
ness statement can have far reaching consequences and negate intuitive properties. 
Consider for instance the following example proposed in a footnote of [11]. 


Imagine performing a random walk on a line, but where the adversary
can stop you at any time $t < 100$. One might like to say that: “if the
adversary allows you to make 100 steps, then with probability at least a
half you will have made 40 steps to the right”.


We now formalize this statement. For all adversaries, the sample space can be taken
to be $\Omega = \{H,T\}^{100}$. For all adversaries, we also consider the associated discrete
$\sigma$-field $2^{\Omega}$. The probability distributions $P_A$ are defined as follows. Let $\alpha$ be an
execution-fragment (i.e., a sequence of draws). If $A$ does not allow the occurrence
of $\alpha$ (i.e., if there is a strict prefix $\alpha_1$ of $\alpha$ after which $A$ allocates no steps) then
$P_A[\alpha] = 0$. If $A$ allows the occurrence of $\alpha$ but blocks the execution at $\alpha$ then
$P_A[\alpha] = 2^{-|\alpha|+1}$, where $|\alpha|$ denotes the length of $\alpha$. If $A$ allows the occurrence of
$\alpha$ and does not block the execution at $\alpha$ then $P_A[\alpha] = 2^{-|\alpha|}$. Hence the support
of the measure $P_A$ is exactly the set of maximal executions allowed (with non-zero
probability) by $A$. We argue that this construction corresponds to an equivalent
form of the general construction given in Chapter 2. In Chapter 2 we (construct and)
consider a different probability space $(\Omega_A, \mathcal{G}_A, P_A)$ for every adversary $A$: $\Omega_A$ is the
set of maximal executions under $A$. Here we consider instead a measurable space
$(\Omega, \mathcal{G})$ common to all the adversaries. Note that the two measurable structures are
related: $(\Omega_A, \mathcal{G}_A) \subseteq (\Omega, \mathcal{G})$ for every $A$. (This means that $\Omega_A \subseteq \Omega$ and that $\mathcal{G}_A \subseteq \mathcal{G}$
for every $A$.) The equivalence between the two models is ensured by the fact that,
in both cases, for every $A$ the support of $P_A$ is equal to $\Omega_A$.


We define $B$ to be the event “the adversary allows you to make 100 steps” and
let $C$ be “Head comes up at least 40 times in the experiment”. Is it true that
$P_A[C \mid B] \ge 1/2$ for all adversaries $A$ allowing 100 steps with non-zero probability?
The answer is no. For instance, let $A$ be the adversary that stops the process as
soon as a Head comes up. This adversary allows 100 steps with some non-zero
probability, namely when the first 99 draws are all Tail. On the other hand, for this adversary,
conditioned on $B$, the number of Heads is at most 1 with probability 1.
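
The counterexample can be checked exactly on a scaled-down version of the game. The sketch below is illustrative only: the function name `measures`, the encoding of a deterministic adversary as a predicate on the draws seen so far, and the reduced parameters (10 steps and a threshold of 4 Heads, instead of 100 and 40) are assumptions made for this example and are not part of the model of Chapter 2.

```python
from fractions import Fraction
from itertools import product


def measures(adversary, n, k):
    """Exact P_A[B], P_A[B and C], and P_A[B => C] for a deterministic adversary.

    adversary(prefix) returns True if one more step is allocated after the
    draws in `prefix` (a tuple of 'H'/'T') have been observed.
    B: all n steps are allocated.  C: at least k Heads come up.
    """
    p_b = Fraction(0)
    p_bc = Fraction(0)
    for draws in product("HT", repeat=n):
        # B holds on `draws` iff the adversary continues after every strict prefix.
        if all(adversary(draws[:i]) for i in range(n)):
            weight = Fraction(1, 2 ** n)
            p_b += weight
            if draws.count("H") >= k:
                p_bc += weight
    # B => C is the event (not B) or (B and C).
    return p_b, p_bc, (1 - p_b) + p_bc


def stop_at_first_head(prefix):
    """Adversary of the counterexample: no more steps once a Head was drawn."""
    return "H" not in prefix


def never_stop(prefix):
    """Adversary that always allocates all n steps."""
    return True


n, k = 10, 4
p_b, p_bc, p_impl = measures(stop_at_first_head, n, k)
print("stop at first Head: P[B] =", p_b, " P[C | B] =", p_bc / p_b, " P[B => C] =", p_impl)

p_b_all, p_bc_all, _ = measures(never_stop, n, k)
print("never stop:         P[B] =", p_b_all, " P[C] =", p_bc_all)
```

On this reduced instance the stopping adversary gives $P_A[B] = 1/512$ and $P_A[C \mid B] = 0$, whereas the adversary that never stops gives $P_A[C] = 53/64$, mirroring the behavior described for the 100-step game.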


But this argument only shows that the correctness property proposed does not fit
the algorithm, not that the algorithm is “not correct”⁶: the algorithm must be
evaluated by other means. For instance, as we now show, the algorithm satisfies

$$\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C] \ \ge\ 1/2$$

and also satisfies

$$\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C] \ \ge\ 1/2.$$

In this last statement $B \Rightarrow C$ denotes the event $\overline{B} \cup C$. ($\overline{B}$ denotes the complement
of $B$.) The first statement is easily proven, as, by definition, “$P_A[B] = 1$” means
that the adversary allocates 100 steps and that, correspondingly, 100 independent
coin tosses are performed:

$$\inf_{A;\ P_A[B] = 1} P_A[C] \ =\ \mathrm{Prob}[\text{100 independent flips result in at least 40 Heads}] \ \ge\ 1/2.$$
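
The bound on the right-hand side can be confirmed by an exact binomial computation. This is a minimal sketch; the helper name `prob_at_least` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import comb


def prob_at_least(n, k):
    """Exact probability that n independent fair flips show at least k Heads."""
    favorable = sum(comb(n, j) for j in range(k, n + 1))
    return Fraction(favorable, 2 ** n)


p = prob_at_least(100, 40)
print(float(p))               # decimal approximation of the exact value
print(p >= Fraction(1, 2))    # the bound used in the text
```

The exact value is roughly 0.98, so the bound 1/2 holds with a wide margin.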


The second statement is harder to prove and is actually a consequence of the equality
$\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C] = \inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$, which we now establish. Let us emphasize
that, as we will later argue, this equality does not hold in general. The idea
of the proof is that an adversary providing fewer than 100 steps in some executions
yields a probability $P_A[B \Rightarrow C]$ higher than or equal to that of some other adversary always
allocating (at least) 100 steps.


To argue this formally, consider an adversary $A$ for which $P_A[B \Rightarrow C]$ is minimized.
Assume the existence of an execution-fragment $\alpha$ (i.e., a sequence of draws) having
length $k$ for some $k$ less than 100, and such that 1) $P_A[\alpha] > 0$ and 2) $A$ does not
allocate any steps after $\alpha$. (We will call $\mathrm{Stop}_A$ the set of such execution fragments
$\alpha$.) Consider the adversary $A'$ which is equal to $A$ except that $A'$ always allocates a
total of 100 steps if the execution begins with $\alpha$. Let $D$ denote the event “the first
$k$ draws do not yield $\alpha$”. Note that, by definition of $D$ and $A$, $P_A[\overline{D} \subseteq \overline{B}] = 1$, and
hence:

$$P_A[B \Rightarrow C] = P_A[B \Rightarrow C, \overline{D}] + P_A[B \Rightarrow C, D] = P_A[\overline{D}] + P_A[B \Rightarrow C, D].$$

We have:

$$\begin{aligned}
P_{A'}[B \Rightarrow C] &= P_{A'}[B \Rightarrow C \mid \overline{D}]\, P_{A'}[\overline{D}] + P_{A'}[B \Rightarrow C \mid D]\, P_{A'}[D] \\
&= P_{A'}[B \Rightarrow C \mid \overline{D}]\, P_A[\overline{D}] + P_A[B \Rightarrow C \mid D]\, P_A[D] \qquad (3.1) \\
&= P_{A'}[B \Rightarrow C \mid \overline{D}]\, P_A[\overline{D}] + P_A[B \Rightarrow C, D] \qquad (3.2) \\
&\le P_A[\overline{D}] + P_A[B \Rightarrow C, D] \\
&= P_A[B \Rightarrow C].
\end{aligned}$$


⁶ The algorithm considered here is the trivial algorithm associated to the random walk and whose
code is “flip a coin”: whenever the adversary allocates a step, that unique instruction is performed.

Equations 3.1 and 3.2 come from the fact that $A'$ behaves as $A$ in $D$ and during the
execution fragment $\alpha$: $A'$ behaves differently only if and once $\alpha$ appears, i.e., in $\overline{D}$.


We have therefore constructed an adversary $A'$ such that $|\mathrm{Stop}_{A'}| < |\mathrm{Stop}_A|$, i.e., which
brings to termination “more” executions than $A$, and which is such that
$P_{A'}[B \Rightarrow C] \le P_A[B \Rightarrow C]$. By iteration, we can therefore produce an adversary, which we
still denote $A'$, that always allocates 100 steps, i.e., such that $P_{A'}[B] = 1$,
and whose associated probability $P_{A'}[B \Rightarrow C]$ is no bigger than $P_A[B \Rightarrow C]$. This
justifies the first equality in the following chain.

$$\begin{aligned}
\inf_{A} P_A[B \Rightarrow C] &= \inf_{A;\ P_A[B] = 1} P_A[B \Rightarrow C] \\
&= \inf_{A;\ P_A[B] = 1} P_A[C] \\
&\ge 1/2.
\end{aligned}$$


Remark that the proof of the equality $\inf_{A} P_A[B \Rightarrow C] = \inf_{A;\ P_A[B] = 1} P_A[C]$ was only
possible because of the special nature of the problem considered here. In particular
we found it very useful that all the spaces $(\Omega_A, \mathcal{G}_A)$ were finite and equal for all
adversaries and that the events $B$ and $C$ were true events (and not only general
event schemas). Also, the proof uses that, for every adversary $A$, it is possible to
construct an adversary $A'$ having the following two properties:


1. The adversary $A'$ gives probability one to $B$: $P_{A'}[B] = 1$.


2. The adversary $A'$ “makes the same decisions as the adversary $A$ along the
executions ending in $B$ under the control of $A$”. More formally, the probability
measure $P_{A'}$ coincides with the probability measure $P_A$ on the set
$\{\omega;\ \omega \in B \text{ and } P_A[\omega] > 0\}$.⁷


The following modification of our example illustrates the importance of the first
of these two properties. (An example illustrating the importance of the second property
could similarly be constructed.) Assume that the game is changed so that, now,
an execution can also stop with some probability $\varepsilon$ at each step of the random
walk. Then, obviously, no adversary can ensure termination with probability one,
i.e., for all $A$, $P_A[B] < 1$. Hence $\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C] = \infty$.⁸ On the other hand
$\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C]$ is obviously bounded (by 1!) and hence
$\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C] < \inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$. This inequality is general, as the following proposition
demonstrates. (We emphasize there, for formality, the dependence on $A$.)


⁷ This property was the key for the derivation of Equations 3.1 and 3.2.


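Proposition 3.2.1 Let $B = (B_A)_{A \in \mathcal{A}}$ and $C = (C_A)_{A \in \mathcal{A}}$ be two event schemas.
Assume that $\{A \in \mathcal{A};\ P_A[B_A] > 0\} \ne \emptyset$. Then

$$\inf_{A \in \mathcal{A};\ P_A[B_A] > 0} P_A[C_A \mid B_A] \ \le\ \inf_{A \in \mathcal{A}} P_A[B_A \Rightarrow C_A] \ \le\ \inf_{A \in \mathcal{A};\ P_A[B_A] = 1} P_A[C_A].$$

PROOF. We abuse notation slightly by writing $B$ instead of $B_A$ and $C$ instead of
$C_A$. Obviously, $\mathcal{A}_1 = \{A \in \mathcal{A};\ P_A[B] = 1\}$ is a subset of $\mathcal{A}$. Also, for every $A$ in $\mathcal{A}_1$,
we have $P_A[B \Rightarrow C] = P_A[C]$. This establishes that

$$\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C] \ \le\ \inf_{A \in \mathcal{A}_1} P_A[C].$$

Let $\mathcal{A}_2$ be the set $\{A \in \mathcal{A};\ P_A[B] > 0\}$. If $A \in \mathcal{A} - \mathcal{A}_2$ then, by definition, $P_A[B] = 0$
and hence $P_A[B \Rightarrow C] = 1$. As by assumption $\mathcal{A}_2$ is not empty, this trivially implies
that $\inf_{A \in \mathcal{A}_2} P_A[C \mid B] \le \inf_{A \in \mathcal{A} - \mathcal{A}_2} P_A[B \Rightarrow C]$.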


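Consider now an adversary $A$ in $\mathcal{A}_2$. We have:

$$\begin{aligned}
P_A[B \Rightarrow C] &= P_A[\overline{B} \cup C] \\
&= P_A[\overline{B} \cup (C \cap B)] \\
&= P_A[\overline{B}] + P_A[C \cap B] \\
&= P_A[\overline{B}] + P_A[C \mid B]\, P_A[B] \\
&= (1 - P_A[C \mid B])\, P_A[\overline{B}] + P_A[C \mid B]\, (P_A[B] + P_A[\overline{B}]) \\
&\ge P_A[C \mid B].
\end{aligned}$$

This immediately implies that $\inf_{A \in \mathcal{A}_2} P_A[C \mid B] \le \inf_{A \in \mathcal{A}_2} P_A[B \Rightarrow C]$. This
inequality along with the one proved above establishes that

$$\inf_{A \in \mathcal{A}_2} P_A[C \mid B] \ \le\ \inf_{A \in \mathcal{A}} P_A[B \Rightarrow C].$$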
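
The chain of inequalities of Proposition 3.2.1 can be observed by brute force on a very small instance of the random-walk game. The sketch below is an illustration under stated assumptions: the game is reduced to 3 steps with threshold 2, a deterministic adversary is encoded as a table mapping every prefix of draws to a continue/stop decision, and the names `all_prefixes`, `measures` and `three_infima` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def all_prefixes(n):
    """All sequences of draws of length 0..n-1 (points where the adversary decides)."""
    return [p for length in range(n) for p in product("HT", repeat=length)]


def measures(continues, n, k):
    """Exact (P_A[B], P_A[B and C]) when `continues` maps each prefix to True/False."""
    p_b = Fraction(0)
    p_bc = Fraction(0)
    for draws in product("HT", repeat=n):
        # All n steps are allocated iff the adversary continues after every strict prefix.
        if all(continues[draws[:i]] for i in range(n)):
            p_b += Fraction(1, 2 ** n)
            if draws.count("H") >= k:
                p_bc += Fraction(1, 2 ** n)
    return p_b, p_bc


def three_infima(n, k):
    """Minimize the three correctness measures over all deterministic adversaries."""
    prefixes = all_prefixes(n)
    cond, impl, forced = [], [], []
    for choices in product([True, False], repeat=len(prefixes)):
        p_b, p_bc = measures(dict(zip(prefixes, choices)), n, k)
        impl.append((1 - p_b) + p_bc)      # P_A[B => C]
        if p_b > 0:
            cond.append(p_bc / p_b)        # P_A[C | B]
        if p_b == 1:
            forced.append(p_bc)            # P_A[C] when A ensures B
    return min(cond), min(impl), min(forced)


low, mid, high = three_infima(n=3, k=2)
print(low, mid, high)
```

For this instance the three values are 0, 1/2 and 1/2: conditioning lets the adversary drive the probability of $C$ to zero, while the other two measures coincide, as in the 100-step example.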


⁸ Recall that $\inf_{x \in X} f(x) = \infty$ if $X$ is the empty set. This is justified by the fact that, by
definition, $\inf_{x \in X} f(x)$ is the biggest lower bound of the set $\{f(x);\ x \in X\}$. Hence, if $X$ is empty,
all numbers are lower bounds and the infimum is therefore infinite.
Note also the interesting general probabilistic relation, which is easily derived using
Bayes’ rule: $P[B \Rightarrow C] = P[C \mid B]$ if and only if $P[B \cap C'] = P[B]\, P[C']$, where we
let $C'$ denote the event $B \Rightarrow C$. In particular, $P[B \Rightarrow C] = P[C \mid B]$ if $(B \Rightarrow C)$ is
independent of $B$.⁹
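
The relation above can be checked exhaustively on a small finite probability space. The following sketch is only an illustration: the four-point space and its weights are arbitrary assumptions, and the helper names `prob`, `events` and `check_equivalence` are hypothetical. The check relies on the identity $B \cap C' = B \cap C$.

```python
from fractions import Fraction
from itertools import chain, combinations

# A hypothetical four-point probability space with non-uniform weights.
weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
omega = frozenset(weights)


def prob(event):
    """Probability of a set of sample points."""
    return sum((weights[w] for w in event), Fraction(0))


def events():
    """All subsets of the sample space."""
    points = sorted(omega)
    subsets = chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))
    return [frozenset(s) for s in subsets]


def check_equivalence():
    """Check the equivalence on every pair (B, C) with P[B] > 0; return the pair count."""
    checked = 0
    for b in events():
        if prob(b) == 0:
            continue
        for c in events():
            implication = (omega - b) | c                       # the event C' = (B => C)
            lhs = prob(implication) == prob(b & c) / prob(b)    # P[B => C] = P[C | B]
            rhs = prob(b & implication) == prob(b) * prob(implication)
            assert lhs == rhs
            checked += 1
    return checked


print(check_equivalence())
```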


Which correctness measure is adequate. 

The previous discussion shows that “adequately” formalizing the correctness statement
of a randomized algorithm is a nontrivial problem. We proposed three possible
ways to formalize a correctness statement presented in the format: “for all adversaries,
if property B holds then property C holds with probability at least $\varepsilon$.” It
is natural to wonder which of the three is most “adequate” in practice, whether
they have different domains of application and whether some other good or even
better measures exist. The answer to these questions has significant implications
as it determines the benchmark against which randomized algorithms are evaluated and
against which their correctness is decided.


What do we mean by an “adequate measure”? The example discussed on page 49
provides a good illustration of the problem. We showed that, for this game and
the choices $B$ = “100 steps are allocated”, $C$ = “At least 40 Heads turn up”, the
measure $\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B]$ is equal to 0. The intuition behind this result
is that the use of this measure provides Player(2) with the implicit knowledge of
$B$. (See page 207 for a discussion on implicit knowledge.) Using this knowledge,
Player(2) is then able to select a specific strategy $A$ so as to nullify the probability
of occurrence of $C$.


To justify why this measure is not adequate we have to return to our original intention.
We can assume that we are provided with a $\Pi/\mathcal{A}$ structure $(S, X, Y, y_{init}, f, g, \Pi, \mathcal{A})$
as in Definition 2.3.1, page 36, formalizing how the game is played between
Player(1) and Player(2). (As we are analyzing a single algorithm $\pi$, $\Pi$ is by definition
equal to the singleton $\{\pi\}$. Also, as mentioned at the beginning of the chapter,
the set $\mathcal{A}$ of admissible adversaries contains only deterministic adversaries.)
We would ideally like to play the role of a passive observer, able to analyze the
game without interfering with it. In this perspective, a measure is deemed adequate
if its use does not perturb the experiment, i.e., the game between Player(1) and
Player(2).


It is clear from our discussion that the measure $\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B]$ greatly
affects the game: we arrive at the scene, impose the unnatural condition $B$ and let
Player(2) use it to its fullest. By contrast, an adequate measure would be one that
would derive probabilistic facts about the $\Pi/\mathcal{A}$ structure that would hold similarly
if no measure were conducted. (This will constitute our definition of adequacy. As in
physics, it is hard to formalize how a measure influences the ambient structure. The
reason for the difficulty is in essence that we have access to the structure studied
only by measuring it.) With this in mind we can now come back to the analysis of
our three measures.


⁹ The relation $P[B \cap C'] = P[B]\, P[C']$ does not imply the independence of $B$ and $C'$, because
this independence also involves similar relations with the complements of $B$ and $C'$.


Probabilistic conditioning. First note that the adversary is restricted in a minimal
way by the precondition $B$ when the measure $\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B]$ is used:
it must just not disallow $B$ probabilistically. Furthermore, as is discussed in Chapter 8,
page 207, Player(2) learns implicitly that the execution is restricted to within
the set $B$. Player(2) can take selective actions (i.e., design some specific strategy
$A$) so as to take advantage of this knowledge.


Conditioning by adversary. By contrast, if the measure $\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$ is
used, Player(2) is in some sense “fully responsible” for ensuring that the precondition
$B$ happens. The only strategies of Player(2) (i.e., the only adversaries) that are
retained are those that ensure with probability one that $B$ happens.


These two measures therefore correspond to two extreme cases. In the setting
imposed by the first measure, Player(2) can use without any restriction the information
that the executions take place in $B$. In the second setting Player(2) is most
restricted and must select strategies ensuring that the executions happen in $B$ with
probability one.


The third measure $\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C]$ does not define so precisely the role played
by Player(2) in bringing about the event $B$ or in taking advantage of it. The fact that
an execution falls in $(B \Rightarrow C)$ is controlled in a mixed fashion by both the random
choices used by the algorithm and by the choices made by Player(2). As
discussed in the example presented on page 51, this measure corresponds to the
measure $\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$ only under very specific circumstances. On the other
hand neither of the two values $\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C]$ and $\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B]$ is
uniformly bigger than the other.


These considerations show that the first measure $\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B]$ using
probabilistic conditioning is an adequate measure in situations where the “natural
dynamics of the game” (i.e., the dynamics described by the $\Pi/\mathcal{A}$ structure
$(S, X, Y, y_{init}, f, g, \Pi, \mathcal{A})$) are such that the precondition $B$ is part of the knowledge
of Player(2).¹⁰ Indeed, as we just argued, this measure implicitly gives the
knowledge of $B$ to Player(2). The measure is adequate, i.e., passive, exactly when
the dynamics ensure naturally that Player(2) holds that knowledge. A typical example
of this situation is obtained when $B$ represents a knowledge that Player(2)
can have acquired during an execution.


Similarly, the second measure $\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$ using conditioning by adversary
is adequate in situations where the dynamics of the $\Pi/\mathcal{A}$ structure are such that
the precondition $B$ depends solely on Player(2).¹¹ This was the situation in the
example presented previously in this section: Player(2) was “naturally” the only
entity deciding the schedule. This situation, where $B$ represents a scheduling decision
depending solely on Player(2), provides a typical example where the measure
$\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$ is adequate.


On the other hand, as we saw, the fact that an execution falls in $(B \Rightarrow C)$ is
controlled in a mixed fashion both by the random choices used by the algorithm
and by the choices made by Player(2). Our third measure $\inf_{A \in \mathcal{A}} P_A[B \Rightarrow C]$ is
therefore adequate under only very special dynamics. For all practical purposes, we
will deem it inadequate.


The notions of “sole dependence” and “knowledge of Player(2)”.

We introduced on page 54 the notions of event schemas that depend solely on Player(2)
and of event schemas that are part of the knowledge of Player(2). We now formalize these notions.
Both require using with precision the model of a $\Pi/\mathcal{A}$ structure presented in Definition 2.3.1,
page 36, and the description of the associated probability schema
$(\Omega_A, \mathcal{G}_A, P_A)_{A \in \mathcal{A}}$ presented on page 41. Recall in particular from the construction
given on page 41 that the random variables $A_1, A_3, A_5, \ldots$ are the actions taken by
Player(2) following its strategy $A$, and that the random variables $A_2, A_4, \ldots$ are the
actions taken by Player(1) following its strategy $\pi$.¹²


Definition 3.2.1 An event schema $B = (B_A)_{A \in \mathcal{A}}$ is said to depend solely on
Player(2) if, for every $A \in \mathcal{A}$, conditioned on the $\sigma$-field $\sigma(A_1, A_3, A_5, \ldots)$,¹³ $B_A$ is
independent of the $\sigma$-field $\sigma(A_2, A_4, A_6, \ldots)$.


Recall that $\mathcal{G}_A$ is by definition equal to $\sigma(A_k;\ k \ge 1)$.¹⁴ Hence the previous condition
is equivalent to: “conditioned on $\sigma(A_1, A_3, A_5, \ldots)$, $B_A$ is independent of $\mathcal{G}_A$”.


Note also that the independence condition of this definition is trivially verified if
$B_A$ can be characterized solely in terms of the actions of Player(2), i.e., if
$B_A \in \sigma(A_1, A_3, A_5, \ldots)$. Indeed, in that case, when conditioned on $\sigma(A_1, A_3, A_5, \ldots)$, $B_A$
“is a constant” and hence independent of $\mathcal{G}_A$.¹⁵ This is a satisfactory fact: we
would expect that events that depend syntactically only on the actions $A_1, A_3, \ldots$
of Player(2) do also “depend solely on Player(2)” in the sense of Definition 3.2.1!
Our more complex definition takes into account that some events $B$ might not be
expressible uniquely in terms of the random variables $A_1, A_3, \ldots$ but still result
in adequate measures of the type $\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$ when enforcing $B$.¹⁶ We
therefore now provide some motivation and justification for how our definition
captures that fact.


As we just recalled, we introduced the notion of an event $B$ “depending solely
on” Player(2) to justify the adequacy of measures where Player(2) enforces the
precondition $B$. The conditional independence of $B$ and of the random choices of
the algorithm ensures that, even though the definition of $B$ might also involve the
choices of the algorithm, $A_2, A_4, \ldots$, these choices do not influence whether $B$ occurs
or not. Hence the occurrence of $B$ depends only on the way values are allocated
to $A_1, A_3, \ldots$, i.e., on the adversary. (Recall that, in Definition 2.3.1, page 36, we
defined an adversary to be a family $A = (\Omega_y, \mathcal{G}_y, P_y)_{y \in Y}$, i.e., the family of laws¹⁷ of
the random variables $A_1, A_3, \ldots$) Therefore restricting the analysis to within $B$ (the
precondition of “if B then C”) corresponds to considering only adversaries ensuring
that $B$ occurs, i.e., such that $P_A[B] = 1$.


This shows that the measure $\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$ is an adequate formalization of
“the probability of C under condition B” if $B$ depends solely on Player(2).


¹⁰ We will formalize this statement later in Definition 3.2.2.

¹¹ We will formalize this statement later in Definition 3.2.1.

¹² Even though sharing the same notation $A$, the notion of actions $A_1, A_2, \ldots$ is distinct from
the notion of set of admissible adversaries $\mathcal{A} = \{A_i\}_{i \in I}$. Note also that, even though there is no
subscript $A$ to emphasize that fact, the random variables $A_k, S_k, X_k, Y_k$; $k \ge 1$ defined on page 41
are defined with respect to a given adversary $A$.

¹³ We cannot say “conditioned on the values taken by $A_1, A_2, \ldots$” because there are in general
infinitely many $A_k$ and we cannot fix all of them when conditioning. We therefore have to resort
to the more general notion of conditioning with respect to a $\sigma$-field. (See [56], page 211. See also
the footnote 3 above.)

¹⁴ See page 41.

¹⁵ A formal proof of this fact is as follows. The formal definition of conditional expectations
recalled in footnote 3 expresses that, for every $f \in L^2(dP)$, the conditional expectation $E[f \mid \mathcal{G}']$
is characterized by the property: $\forall \phi \in L^2(dP) \cap \mathcal{G}'$, $E[\phi f] = E[\phi\, E[f \mid \mathcal{G}']]$. Hence, for every
$\phi \in L^\infty(dP) \cap \mathcal{G}'$ and every $g \in L^2(dP)$, $E[\phi f g] = E[\phi\, E[f g \mid \mathcal{G}']]$. On the other hand, if by
assumption $f$ is in $\mathcal{G}'$ we also have $E[\phi f g] = E[\phi f\, E[g \mid \mathcal{G}']]$. Hence, in that case,
$\forall \phi \in L^2(dP) \cap \mathcal{G}'$, $E[\phi\, E[f g \mid \mathcal{G}']] = E[\phi f\, E[g \mid \mathcal{G}']]$. This implies that
$E[f g \mid \mathcal{G}'] = f\, E[g \mid \mathcal{G}'] = E[f \mid \mathcal{G}']\, E[g \mid \mathcal{G}']$
$P$-almost surely. As $g$ is arbitrary in $L^2(dP)$ this precisely means that, conditioned on $\mathcal{G}'$, $f$ is
independent of $\mathcal{G}$.

¹⁶ See our discussion about adequate measures on page 53.

¹⁷ See Definition 8.1.2, page 198, for a definition of the notion of law of a random variable.
We conclude this discussion about events “depending solely” on Player(2) with two
caveats. To begin, note that even if an event schema depends solely on Player(2),
there might exist no strategy $A$ such that $P_A[B] = 1$. The reason is that the actions
$A_k$ are random variables and hence not in the full control of Player(2).


This brings us to the second point. A special but important case is when, as is
the assumption in this chapter, the strategies of Player(2) are all deterministic. In
that case, each action $A_k$ taken by Player(2) depends deterministically on the view
$Y_k$: $Y_k$ represents the knowledge held by Player(2) at level $k$. Hence, in this case,
the condition presented in Definition 3.2.1 is equivalent to $Y_1, Y_2, \ldots$ determining
$B$ completely. Nevertheless it is possible that $B$ is not characterized by a single
variable $Y_k$. In that case, there is no point $k$ at which Player(2) knows $B$, even though
$B$ depends solely on that player. A justification for this apparent paradox is that
the sequence of views of Player(2) characterizes $B$, but Player(2) lacks at each
single point the possibility to assess more than one value of the sequence $(Y_k)_{k \in \mathbb{N}}$.
We can encounter such situations, where some $B$ depends solely on the views of
Player(2) but where Player(2) does not “know” it, if, for instance, Player(2) does
not remember completely the past (i.e., if the past values $Y_1, \ldots, Y_{k-1}$ are not
recorded in the view $Y_k$) or if $B$ depends on infinitely many views $Y_k$.


Definition 3.2.2 An event schema $(B_A)_{A \in \mathcal{A}}$ is said to be part of the knowledge of
Player(2) if there exists a (random) step number $k_A$ such that $B_A$ is measurable with
respect to the view $Y_{k_A}$.


By definition, $B_A$ is measurable with respect to $Y_{k_A}$ if there is a (deterministic)
function $f_A$ such that the indicator of $B_A$ is equal to $f_A(Y_{k_A})$. As mentioned in the
caveat above, the knowledge accessible to Player(2) at each point $k$ is completely
described by the variable $Y_k$. Our definition therefore formalizes well that $B$ is part
of the knowledge of Player(2). We allow $k_A$ to be random to take into account the
fact that information becomes available to the players at random times.


Composition of adequate correctness-measures.

The two types of measure $\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B]$ and $\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C]$ can
be combined to yield an adequate composite measure¹⁸ in the following situations.
Player(2) holds some specific knowledge $B_1$ and uses this knowledge to enforce
a condition $B_2$, which depends solely on him, trying to minimize the probability
of occurrence of an event $C$. In this case, the probability of occurrence of $C$ is
adequately¹⁸ measured by:

$$\inf_{A;\ P_A[B_2 \mid B_1] = 1,\ P_A[B_1] > 0} P_A[C \mid B_1].$$


¹⁸ See our discussion about adequate measures on page 53.


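We actually have to be more precise to ensure that this measure is adequate¹⁸.
In Definition 3.2.1, the formalization of “B depends solely on Player(2)” is made
(implicitly) in terms of the probability schema $(\Omega_A, \mathcal{G}_A, P_A)_{A \in \mathcal{A}}$.¹⁹ In contrast, the
formalization of “$B_2$ depends solely on Player(2)” is made conditioned on $B_1$, i.e., is
made in terms of the probability schema

$$\left(B_{1,A},\ \mathcal{G}_A \cap B_{1,A},\ P_A[\,\cdot\,]/P_A[B_{1,A}]\right)_{A \in \mathcal{A}'},$$

where $B_1 = (B_{1,A})_{A \in \mathcal{A}}$, $\mathcal{G}_A \cap B_{1,A} \stackrel{\mathrm{def}}{=} \{C \cap B_{1,A};\ C \in \mathcal{G}_A\}$ and where
$\mathcal{A}' \stackrel{\mathrm{def}}{=} \{A \in \mathcal{A};\ P_A[B_{1,A}] > 0\}$.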


We now generalize this construction. Our discussion involves a sequence of event
schemas $B_1, B_2, \ldots$ To justify the validity of our argument we make the following
assumption formalizing that, for every $k$, $B_k$ happens before $B_{k+1}$ in the execution.
(This requirement might possibly be relaxed and the argument generalized.)


Assumption. There exists an increasing sequence of (random) values $n_1, n_2, \ldots$
such that, for every $k$, $B_k \in \sigma(Y_{n_k}, Y_{n_k + 1}, \ldots, Y_{n_{k+1} - 1})$.


Assume that the natural dynamics²⁰ of the $\Pi/\mathcal{A}$ structure ensure the following.
Player(2) holds the knowledge of $B_1$. Using this knowledge Player(2) decides
single-handedly on $B_2$: by assumption $B_2$ depends solely on Player(2). (As above, the
formal translation of this fact requires the use of the probability schema
$(B_{1,A},\ \mathcal{G}_A \cap B_{1,A},\ P_A[\,\cdot\,]/P_A[B_{1,A}])_{A \in \mathcal{A}'}$.) Player(2) then observes $B_3$ and uses this knowledge to
decide on some $B_4$ that depends solely on him, and so on. We symbolically let

$$B_1 \mid B_2 \ \to\ B_3 \mid B_4 \ \to\ \cdots$$

denote such scenarios. To each scenario, we can associate an adequate correctness
measure by iterating the previous procedure. For instance, an adequate correctness
measure associated to a scenario $B_1 \mid B_2 \to B_3$ is given by:

$$\inf_{A;\ P_A[B_3 \mid B_1, B_2] > 0,\ P_A[B_2 \mid B_1] = 1,\ P_A[B_1] > 0} P_A[C \mid B_1, B_3].$$


¹⁹ The notion of probability schema is presented in Definition 2.4.1.

²⁰ The notion of natural dynamics is presented on page 54.


We thus obtain a whole family of (rather complex) adequate high-probability correctness
measures. A special but important case where we can easily write such
adequate measures is when, for each odd index $2i - 1$, the precondition $B_{2i-1}$ is
part of the knowledge of Player(2) and when $B_{2i}$ is described in terms of the action
taken next by Player(2).²¹ Indeed, in that case, the precondition $B_{2i}$ depends solely
on Player(2) (conditioned on $B_1, \ldots, B_{2i-1}$).


Summary and open questions about adequate measures. 
We saw that there are two extremes when formalizing a high-probability statement
of the form “for all adversaries, if property B holds then property C holds with
probability at least $\varepsilon$.” In one extreme, we restrict the analysis to within $B$, leaving
Player(2) free to select the most damaging strategy it wishes. This leads to the
measure

$$\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B].$$

This measure might be “unfair” to Player(1) as it gives to Player(2) some knowledge
that it might not receive otherwise according to the natural dynamics of the $\Pi/\mathcal{A}$
structure considered. The other extreme is to require that the adversary ensures $B$
with probability one. This leads to the measure

$$\inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C].$$

This measure can be “unfair” to Player(2) as it forces that player to guarantee alone
an event which might depend on both players. Any other high probability measure
$M$ formalizing the statement above must fall between these two measures:

$$\inf_{A \in \mathcal{A};\ P_A[B] > 0} P_A[C \mid B] \ \le\ M \ \le\ \inf_{A \in \mathcal{A};\ P_A[B] = 1} P_A[C].$$


²¹ We formalize here this statement. We will use the convenient labeling presented on page 65 to
describe an execution: $\omega = A_1\, (S_1, X_1, Y_1)\ A'_1\, (S'_1, X'_1, Y'_1)\ A_2\, (S_2, X_2, Y_2) \ldots$, where $A_1$ is an
action of Player(2), $A'_1$ an action of Player(1), $A_2$ an action of Player(2), and so on. With these
conventions a formal translation of our statement is that, for every $A$ in $\mathcal{A}$, there exists a (random) index
$k_A$ such that $B_{2i-1,A} \in \sigma(Y_{k_A})$ and such that $B_{2i,A} \in \sigma(A_{k_A})$.
It would be very interesting to be able to characterize the set of adequate measures
of a given randomized algorithm.


Existence. In particular, an interesting question is whether there always exists an
adequate²² formalization of the expression “the highest probability of C under condition B”.
If not, this would mean that some correctness statements are by nature
ill-formed. In particular, we would like to know whether Rabin’s property quoted
on page 61 can be adequately formalized in connection with the $\Pi/\mathcal{A}$ structure presented
on page 64.


Completeness. We saw that the set of actions of Player(2) depends solely²³ on
her. Combining such actions with events that are part of her knowledge²³ allowed us
to derive (infinitely many) adequate measures associated to a randomized algorithm.
Are all the adequate measures of this type? This would show that the methods of
conditioning by adversary and of probabilistic conditioning²⁴ form a “basis” for all other
adequate measures.


3.3 Proving Lower Bounds


The previous section was devoted to adequately formalizing a statement that was
stated informally. We consider in this section the technical problem of providing a
lower bound for an expression already formalized into the form
$\inf_{A \in \mathcal{A}} P_A[W \mid I]$. In general it is difficult to estimate this expression
directly. Indeed recall that W and I are event schemas and that we actually have to
estimate $\inf_{A \in \mathcal{A}} P_A[W_A \mid I_A]$.


Let $(\Omega, \mathcal{G}, P)$ be the probability space associated to the random inputs z used by the
algorithm. (Rabin’s algorithm is using two independent coins so that this probability
space is easily defined in that case.) Our method for proving a high probability
correctness property of the form $P_A[W \mid I]$ consists of proving successive lower bounds:

\[ P_A[W \mid I] \;\ge\; P_A[W_1 \mid I_1] \;\ge\; \cdots \;\ge\; P_A[W_r \mid I_r], \]


where all the $W_i$ and $I_i$ are event schemas, and where the last two event schemas, $W_r$
and $I_r$, are true events in $\mathcal{G}$ and do not depend on A. The final term, $P_A[W_r \mid I_r]$,
is then evaluated (or bounded from below) using the distribution P. In practice this
method can be rather difficult to implement, as it involves disentangling the ways
in which the random choices made by the processes affect the choices made by the
adversary.


22 See our discussion about adequate measures on page 53.
23 See Definitions 3.2.1 and 3.2.2.
24 See page 54.


3.4 Rabin’s Algorithm 


The problem of mutual exclusion [16] involves allocating an indivisible, reusable
resource among n competing processes. A mutual exclusion algorithm is said to
guarantee progress²⁵ if it continues to allocate the resource as long as at least one
process is requesting it. It guarantees no-lockout if every process that requests
the resource eventually receives it. A mutual exclusion algorithm satisfies bounded
waiting if there is a fixed upper bound on the number of times any competing
process can be bypassed by any other process. In conjunction with the progress
property, the bounded waiting property implies the no-lockout property. In 1982,
Burns et al. [12] considered the mutual exclusion problem in a distributed setting
where processes communicate through a shared read-modify-write variable. For
this setting, they proved that any deterministic mutual exclusion algorithm that
guarantees progress and bounded waiting requires that the shared variable take
on at least n distinct values. Shortly thereafter, Rabin published a randomized
mutual exclusion algorithm [49] for the same shared memory distributed setting. His
algorithm guarantees progress using a shared variable that takes on only O(log n)
values.


It is quite easy to verify that Rabin’s algorithm guarantees mutual exclusion and
progress; in addition, however, Rabin claimed that his algorithm satisfies the
following informally-stated strong no-lockout property²⁶.


“If process i participates in a trying round of a run of a computation by
the protocol and compatible with the adversary, together with 0 ≤ m − 1 <
n other processes, then the probability that i enters the critical region at
the end of that round is at least c/m, c ≈ 2/3.” (*)


This property says that the algorithm guarantees an approximately equal chance of
success to all processes that compete at the given round. Rabin argued in [49] that
a good randomized mutual exclusion algorithm should satisfy this strong no-lockout
property, and in particular, that the probability of each process succeeding should
depend inversely on m, the number of actual competitors at the given round. This
dependence on m was claimed to be an important advantage of this algorithm over
another algorithm developed by Ben-Or (also described in [49]); Ben-Or’s algorithm
is claimed to satisfy a weaker no-lockout property in which the probability of success
is approximately c/n, where n is the total number of processes, i.e., the number of
potential competitors.


25 We give more formal definitions of these properties in Section 3.5.

26 In the statement of this property, a “trying round” refers to the interval between two successive
allocations of the resource, and the “critical region” refers to the interval during which a particular
process has the resource allocated to it. A “critical region” is also called a “critical section”.


Rabin’s algorithm uses a randomly-chosen round number to conduct a competition 
for each round. Within each round, competing processes choose lottery numbers 
randomly, according to a truncated geometric distribution. One of the processes 
drawing the largest lottery number for the round wins. Thus, randomness is used 
in two ways in this algorithm: for choosing the round numbers and choosing the 
lottery numbers. The detailed code for this algorithm appears in Figure 3.1. 


We begin our analysis by presenting three different formal versions of the no-lockout 
property. These three statements are of the form discussed in the introduction and 
give lower bounds on the (conditional) probability that a participating process wins 
the current round of competition. They differ by the nature of the events involved 
in probabilistic conditioning, those involved in conditioning by adversary and by the 
values of the lower bounds. 


Described in this formal style, neither of the two forms of conditioning (probabilistic
conditioning and conditioning by adversary) provides an adequate formalization²⁷
of the fact that m processes participate in the round. We show in Theorem 3.6.1
that, if probabilistic conditioning is selected, then the adversary can use this fact in
a simple way to lock out any process during any round.


On the other hand, the weak c/n no-lockout property that was claimed for Ben-Or’s
algorithm involves only conditioning over events that describe the knowledge of the
adversary at the end of the previous round. We show in Theorems 3.6.2 and 3.6.4 that
the algorithm suffers from a different flaw which bars it from satisfying even this
property.

We discuss here informally the meaning of this result. The idea in the design of
the algorithm was to incorporate a mathematical procedure within a distributed
context. This procedure allows one to select with high probability a unique random
element from any set of at most n elements. It does so in an efficient way using
a distribution of small support (“small” means here O(log n)) and is very similar
to the approximate counting procedure of [20]. The mutual exclusion problem in a
distributed system is also about selecting a unique element: specifically the problem
is to select in each trying round a unique process among a set of competing
processes. In order to use the mathematical procedure for this end and select a truly
random participating process at each round and for all choices of the adversary, it
is necessary to discard the old values left in the local variables by previous calls of
the procedure. (If not, the adversary could take advantage of the existing values.)
For this, another use of randomness was designed so that, with high probability, at
each new round, all the participating processes would erase their old values when
taking a step.


27 “Adequate” in the sense of Section 3.2.


Our results demonstrate that this use of randomness did not actually fulfill its 
purpose and that the adversary is able in some instances to use old lottery values 
and defeat the algorithm. 


In Theorem 3.6.6 we show that the two flaws revealed by our Theorems 3.6.1 and 
3.6.2 are at the center of the problem: if one restricts attention to executions where 
program variables are reset, and if we condition by adversary on the number m of 
participating processes then the strong bound does hold. Our proof, presented in 
Proposition 3.6.7, highlights the general difficulties encountered in our methodology 
when attempting to disentangle the probabilities from the influence of A. 


The algorithm of Ben-Or which is presented at the end of [49] is a modification of 
Rabin’s algorithm that uses a shared variable of constant size. All the methods that 
we develop in the analysis of Rabin’s algorithm apply to this algorithm and establish 
that Ben-Or’s algorithm is similarly flawed and does not satisfy the 1/(2en) no-lockout
property claimed for it in [49]. Actually, in this setting, the shared variables can take 
only two values, which allows the adversary to lock out processes with probability 
one, as we show in Theorem 3.6.9. 


In a recent paper [36], Kushilevitz and Rabin use our results to produce a
modification of the algorithm, solving randomized mutual exclusion with $\log_2^2 n$ values.
They solve the problem revealed by our Theorem 3.6.1 by conducting before round k
the competition that results in the control of Crit by the end of round k. And they
solve the problem revealed by our Theorem 3.6.2 by enforcing in the code that the
program variables are reset to 0.


The remainder of this chapter is organized as follows. Section 3.5 contains a
description of the mutual exclusion problem and formal definitions of the strong and weak
no-lockout properties. Section 3.6 contains our results about the no-lockout
properties for Rabin’s algorithm. It contains Theorems 3.6.1 and 3.6.2 which disprove in
different ways the strong and weak no-lockout properties and Theorem 3.6.6 whose
proof is a model for our methodology: a careful analysis of this proof reveals
exactly the origin of the flaws stated in the two previous theorems. One of the uses of
randomness in the algorithm was to prevent the adversary from knowing the value
of the program variables. Our Theorems 3.6.2 and 3.6.8 express that this objective
is not reached and that the adversary is able to infer (partially) the value of all
the fields of the shared variable. Theorem 3.6.9 deals with the simpler setting of
Ben-Or’s algorithm.


Some mathematical properties needed for the constructions of Section 3.6 are pre- 
sented in an appendix (Section 3.7). 


3.5 The Mutual Exclusion Problem 


The problem of mutual exclusion is that of continually arbitrating the exclusive
ownership of a resource among a set of competing processes. The set of competing
processes is taken from a universe of size n and changes with time. A solution to
this problem is a distributed algorithm described by a program (code) C having the
following properties. All involved processes run the same program C. C is partitioned
into four regions, Try, Crit, Exit, and Rem, which are run cyclically in this order
by all processes executing C. A process in Crit is said to hold the resource. The
indivisibility property of the resource means that at any point of an execution, at most
one process should be in Crit.


3.5.1 Definition of Runs, Rounds, and Adversaries 


In this subsection, we define the notions of run, round, adversary, and fair adversary 
which we will use to define the properties of progress and no-lockout. 


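A run ρ of a (partial) execution ω is a sequence of triplets
$\{(p_1, \mathrm{old}_1, \mathrm{new}_1), (p_2, \mathrm{old}_2, \mathrm{new}_2), \ldots, (p_t, \mathrm{old}_t, \mathrm{new}_t), \ldots\}$
indicating that process $p_t$ takes the t-th step in ω and undergoes the region change
$\mathrm{old}_t \to \mathrm{new}_t$ during this step (e.g., $\mathrm{old}_t = \mathrm{new}_t = \mathrm{Try}$,
or $\mathrm{old}_t = \mathrm{Try}$ and $\mathrm{new}_t = \mathrm{Crit}$). We say that ω is compatible with ρ.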


An adversary for the mutual exclusion problem is a mapping A from the set of
finite runs to the set {1,...,n} that determines which process takes its next step
as a function of the current partial run. That is, the adversary is only allowed to
see the changes of regions. A run
$\rho = \{(p_1, \mathrm{old}_1, \mathrm{new}_1), (p_2, \mathrm{old}_2, \mathrm{new}_2), \ldots\}$ and an adversary A are
compatible if, for every t,
$A[\{(p_1, \mathrm{old}_1, \mathrm{new}_1), \ldots, (p_t, \mathrm{old}_t, \mathrm{new}_t)\}] = p_{t+1}$.
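
The definitions of run, adversary and compatibility can be sketched in a few lines of code. This is only an illustration under our own assumptions: the names `Step`, `is_compatible` and `round_robin` are hypothetical, regions are represented as strings, and we assume that the adversary is also consulted on the empty run to choose the first process.

```python
from typing import Callable, List, Tuple

Step = Tuple[int, str, str]        # (process p_t, old region, new region)
Run = List[Step]
Adversary = Callable[[Run], int]   # finite run -> next process in {1, ..., n}


def is_compatible(run: Run, adversary: Adversary) -> bool:
    """Check that A[first t steps of the run] == p_{t+1} for every t.

    Assumption: t ranges from 0, i.e. the adversary also picks p_1
    from the empty run.
    """
    return all(adversary(run[:t]) == run[t][0] for t in range(len(run)))


def round_robin(n: int) -> Adversary:
    """Hypothetical adversary scheduling processes 1, 2, ..., n cyclically.

    It only looks at the length of the run, which is part of what an
    adversary is allowed to see.
    """
    return lambda prefix: len(prefix) % n + 1


example_run: Run = [(1, "Rem", "Try"), (2, "Rem", "Try"), (1, "Try", "Try")]
print(is_compatible(example_run, round_robin(2)))
```

Note that the adversary receives only the triplets of the run, never the values of the program variables, which mirrors the restriction that it sees only the changes of regions.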
The associated Π/A-structure. We show how these definitions can be formalized
into a Π/A-structure within the general model presented in Definition 2.3.1,
page 36. Recall that we consider here only deterministic adversaries, i.e., adversaries
for which, for every k, the k-th action of Player(2), $A_k$, is a deterministic function of
the view $Y_k$.


To simplify the exposition we will slightly change the notations of Chapter 2, page 41,
in the following way. We write here a random execution ω as

\[ \omega = \underbrace{A_1\,(S_1, X_1, Y_1)}_{\text{Player(2)}}\ \underbrace{A'_1\,(S'_1, X'_1, Y'_1)}_{\text{Player(1)}}\ \underbrace{A_2\,(S_2, X_2, Y_2)}_{\text{Player(2)}} \ldots \]

$A_k$ is the k-th action taken by Player(2). $S_k$, $X_k$ and $Y_k$ are respectively the state
of the system, the view of Player(1) and the view of Player(2) resulting after this
action. Similarly we let $A'_k$, $S'_k$, $X'_k$ and $Y'_k$ denote the k-th action taken by Player(1),
the state of the system, and the views of the two players resulting after this action.


• The set of states, S, is the set of tuples containing the values of the program
counters $pc_1, \ldots, pc_n$, the values of all the local variables and the value of the
shared variable.

• The actions $A_k$ take values in {1,...,n}.

• The actions $A'_k$ are described by the code of the algorithm given in Figure 3.1,
page 74. (We will not make these actions more explicit.)

• The views $X_k$ take values in S × {1,...,n}. (If the view of Player(1) is (s, i),
then i represents the process to take a step next.)

• The views $Y_k$ take values in {runs of length (k − 1)} × {1,...,n}. ($Y_k = (y, i)$
means that the previous view of Player(2) was $Y'_{k-1} = y$ and Player(2) just
selected i.)

• The set of views $X'_k$ is equal to S. (Player(1) just remembers the state of the
system after its step.)

• The views $Y'_k$ take values in {runs of length k}.


The update functions are described as follows. Assume that
$(S_k, X_k, Y_k) = (s, (s, i), (y, i))$. Then
$(S'_k, X'_k, Y'_k) = f(S_k, X_k, Y_k, A'_k) = (s', s', (y, i, \mathrm{new}_i))$,²⁸ where s' is the
state of the system after the k-th move of Player(1) and $\mathrm{new}_i$ is the region reached by
process i. Similarly assume that $(S'_k, X'_k, Y'_k) = (s, s, y)$. Then
$(S_{k+1}, X_{k+1}, Y_{k+1}) = g(s, s, y, i) = (s, (s, i), (y, i))$, where i is the process selected by
Player(2) for round k+1.


28 For simplicity we do not recall $\mathrm{old}_i$, which is recorded in y.
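
The two update functions can be sketched as follows. This is a toy illustration and not the algorithm of Figure 3.1: the state is reduced to one region per process, and `toy_move` is a hypothetical stand-in for the action of Player(1) that simply advances the selected process to its next region (it does not enforce mutual exclusion).

```python
from typing import Callable, Tuple

Regions = Tuple[str, ...]               # toy state: the region of each process
RunView = Tuple[Tuple[int, str], ...]   # Player(2) sees (process, new region) pairs


def g(s: Regions, x_prime: Regions, y: RunView, i: int):
    """Player(2) selects process i: (s, s, y) -> (s, (s, i), (y, i)).

    The state is unchanged; Player(1) learns i, Player(2) remembers y and i.
    """
    return s, (s, i), (y, i)


def f(s: Regions, x, y_view, player1_move: Callable):
    """Player(1) moves: (s, (s, i), (y, i)) -> (s', s', (y, i, new_i))."""
    _, i = x
    y, _ = y_view
    s_next, new_region = player1_move(s, i)
    # Player(1) keeps only the new state; Player(2) extends its run.
    return s_next, s_next, y + ((i, new_region),)


CYCLE = {"Rem": "Try", "Try": "Crit", "Crit": "Exit", "Exit": "Rem"}


def toy_move(s: Regions, i: int):
    """Hypothetical Player(1) action: advance process i to its next region."""
    new_region = CYCLE[s[i - 1]]
    return s[:i - 1] + (new_region,) + s[i:], new_region


s0 = ("Rem", "Rem")
state, x, y = g(s0, s0, (), 1)          # Player(2) schedules process 1
state, x, y = f(state, x, y, toy_move)  # Player(1) performs the step
print(state, x, y)
```

The asymmetry of information is visible in the return values: the view of Player(1) is the full state, while the view of Player(2) records only who moved and which region was reached.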


This defines the Π/A-structure associated to Rabin’s randomized algorithm for
mutual exclusion. The construction given in Chapter 2, page 36, defines the associated
probability schema $(\Omega_A, \mathcal{G}_A, P_A)_{A \in \mathcal{A}}$ over which we will conduct the analysis.


In this model Player(1) “forgets” systematically all the past, knows the current 
state and learns what the last action of Player(2) is. By contrast, we will consider in 
Chapter 7 an example where Player(1) learns nothing about the moves of Player(2), 
consequently knows only partially the state, and remembers everything about its 
past actions. 


For every adversary A, an execution ω in $\Omega_A$ is in $\mathrm{Fair}_A$ if every process i in Try,
Crit, or Exit is eventually provided by A with a step. This condition describes
“normal” executions of the algorithm and says that processes can quit the
competition only in Rem. The number of states, actions and views being finite we can
express $\mathrm{Fair}_A$ as an expression involving (only) countably many rectangles.²⁹ This
establishes that $\mathrm{Fair}_A \in \mathcal{G}_A$ and that the family $\mathrm{Fair} = (\mathrm{Fair}_A)_{A \in \mathcal{A}}$ is an
event-schema. An adversary A is fair if the executions produced under the control of A
are in Fair with probability one, i.e., if $P_A[\mathrm{Fair}_A] = 1$. This definition was also given
in Vardi [58], page 334. Player(2) is fair if every A ∈ $\mathcal{A}$ is fair.³⁰


In this chapter, we choose the set A of admissible adversaries to be the set of fair 
adversaries. 


A round of an execution is the part between two successive entrances to the critical
section (or before the first entrance). More specifically, it is a maximal execution
fragment of the given execution, containing one transition Try → Crit at the end
of this fragment and no other transition Try → Crit. The round of a run is defined
similarly. For every k, we let round(k) denote the k-th round. Formally, round(k)
is a variable-schema³¹ $\mathrm{round}(k) = (\mathrm{round}_A(k))_{A \in \mathcal{A}}$: $\mathrm{round}_A(k)$ is the random k-th
round of the generic execution obtained when the adversary is A. (For completeness
we write $\mathrm{round}_A(k) = \bot$ if the execution has fewer than k rounds.)


29 See page 42 for a definition of a rectangle and page 43 for a definition of an event-schema.

30 The probabilistic notion of fairness allows more flexibility in the definition of adversaries than
requiring that all executions be in Fair. The following example illustrates that fact. Consider two
processes each running the simple code: “Flip a fair coin”. This means that process i (i = 1 or
2) flips a coin whenever allocated a step by Player(2). An execution is in Fair if both processes
take infinitely many steps. Consider the adversary A defined by: “I. Allocate steps repeatedly
to process 1 until Head comes up. Then allocate steps repeatedly to process 2 until Head comes
up. Then go back to I and repeat.” This adversary is fair in the probabilistic sense: $P_A[\mathrm{Fair}] = 1$.
Nevertheless it produces (with probability zero) some executions that are not in Fair.
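
The two-coin adversary described in the fairness footnote of this section can be simulated directly. The sketch below is only an illustration: the function name `schedule` is hypothetical, and the coin flips are passed in explicitly (True for Head) so that both a typical execution and the probability-zero unfair execution can be reproduced.

```python
import random
from typing import Iterable, List


def schedule(flips: Iterable[bool]) -> List[int]:
    """Schedule produced by the adversary: 'give steps to process 1 until
    Head comes up, then to process 2 until Head comes up, and repeat'.

    Each element of `flips` is the outcome of the coin flipped by the
    process that was just given a step (True means Head).
    """
    current = 1
    steps = []
    for head in flips:
        steps.append(current)      # the current process takes a step and flips
        if head:                   # on Head, the adversary switches process
            current = 2 if current == 1 else 1
    return steps


# A typical execution: with fair coins both processes keep receiving steps.
rng = random.Random(0)
typical = schedule(rng.random() < 0.5 for _ in range(10_000))
print(typical.count(1), typical.count(2))

# The unfair execution: if Head never comes up, process 2 is never scheduled.
# Such an execution exists, but the set of such executions has probability zero.
print(set(schedule([False] * 20)))
```

The all-Tail sequence shows concretely why requiring every execution to be in Fair would exclude this adversary, whereas the probabilistic definition accepts it.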


A process i participates in a round if i takes a step while being either in its trying
section Try or at rest in its section Rem. Hence, for a participating process,
$\mathrm{old}_i \in \{\mathrm{Rem}, \mathrm{Try}\}$ and $\mathrm{new}_i \in \{\mathrm{Try}, \mathrm{Crit}\}$.
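
As an illustration of the definitions of round and of participation, the following sketch splits a run into its completed rounds and computes the set of participating processes of each. The function names are hypothetical, and the participation test is exactly the condition on old and new regions stated above.

```python
from typing import List, Set, Tuple

Step = Tuple[int, str, str]   # (process, old region, new region)


def split_rounds(run: List[Step]) -> List[List[Step]]:
    """Cut a run into rounds: each round ends with its single Try -> Crit step.

    A trailing fragment with no Try -> Crit transition is an unfinished
    round and is not returned.
    """
    rounds, current = [], []
    for step in run:
        current.append(step)
        _, old, new = step
        if old == "Try" and new == "Crit":
            rounds.append(current)
            current = []
    return rounds


def participants(round_steps: List[Step]) -> Set[int]:
    """Processes that take a step from Rem or Try into Try or Crit."""
    return {p for p, old, new in round_steps
            if old in ("Rem", "Try") and new in ("Try", "Crit")}


run = [
    (1, "Rem", "Try"), (2, "Rem", "Try"), (1, "Try", "Crit"),   # round 1
    (1, "Crit", "Exit"), (2, "Try", "Try"), (1, "Exit", "Rem"),
    (2, "Try", "Crit"),                                          # round 2
    (3, "Rem", "Try"),                                           # unfinished
]
for k, r in enumerate(split_rounds(run), start=1):
    print(k, participants(r))
```

Observe that the set of participants of a round is only known once the round has ended, since any process scheduled before the closing Try → Crit step may still join it.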


3.5.2 The Progress and No-Lockout Properties 


Definition 3.5.1 An algorithm C that solves mutual exclusion guarantees progress
if for all fair³² adversaries there is no infinite execution in which, from some point
on, at least one process is in its Try region (respectively its Exit region) and no
transition Try → Crit (respectively Exit → Rem) occurs.


Recall that the notion of fair adversary is probabilistic. But the notion of mutual
exclusion is not probabilistic: we require that, for all fair adversaries and all executions
ω in $\mathrm{Fair}_A$, ω has the property stated in Definition 3.5.1.


We now turn to the no-lockout property. This property is probabilistic. Its
formal definition requires the following notation:


For every adversary A, let $X_A$ denote any generic quantity whose value changes as
the execution unfolds under the control of A (e.g., the value of a program variable).
We let $X_A(k)$ denote the value of $X_A$ just prior to the last step (Try → Crit) of
the k-th round of the execution. As a special case of this general notation, we define
the following.


• $\mathcal{P}_A(k)$ is the set of participating processes in round k. (Set $\mathcal{P}_A(k) = \emptyset$ if ω
has fewer than k rounds.) The notation $\mathcal{P}_A(k)$ is consistent with the general
notation because the set of processes participating in round k is updated as
round k progresses: in effect the definition of this set is complete only at the
end of round k. (This fact is at the heart of our Theorem 3.6.1.)

• $t_A(k)$ is the total number of steps that are taken by all the processes up to
the end of round k.

• $\mathcal{N}_A(k)$ is the set of executions in which all the processes j participating in
round k reinitialize their program variables $B_j$ with a new value $\beta_j(k)$ during
round k. ($\mathcal{N}$ stands for New-values.) $\beta_j(k)$; k = 1, 2, ..., j = 1, ..., n is a
family of iid³³ random variables whose distribution is geometric truncated at
$\log_2 n + 4$ (see [49]).

• For each i, $W_{i,A}(k)$ denotes the set of executions in which process i enters the
critical region at the end of round k.


31 See Definition 2.4.1.
32 The mention of fairness is put here just as a reminder: recall that, in this chapter, all the
admissible adversaries are fair.


We consistently use the probability theory convention according to which, for any
property $S_A$, the set of executions $\{\omega \in \Omega_A : \omega \text{ has property } S_A\}$ is denoted as
$\{S_A\}$. Then:


• For each step number t and each execution $\omega \in \Omega_A$ we let $\pi_{t,A}(\omega)$ denote the
run compatible with the first t steps of ω. For any t-step run ρ, $\{\pi_{t,A} = \rho\}$
represents the set of executions compatible with ρ. ($\{\pi_{t,A} = \rho\} = \emptyset$ if ρ has
fewer than t steps.) We will use $\pi_{k,A}$ in place of $\pi_{t_A(k),A}$ to simplify notation.

Note that the definition of $\pi_{t,A}$ can be made independently of any adversary
A, justifying the simpler notation $\pi_t$. We nevertheless keep the subscript A to
emphasize that, for every A, $\pi_{t,A}$ is defined on $(\Omega_A, \mathcal{G}_A)$ which depends on A.

• Similarly, for all m ≤ n, $\{|\mathcal{P}_A(k)| = m\}$ represents the set of executions having
m processes participating in round k.


For every A, the quantities $\mathcal{N}_A(k)$, $\{\pi_{t,A} = \rho\}$, $W_{i,A}(k)$, $\{|\mathcal{P}_A(k)| = m\}$,
$\{i \in \mathcal{P}_A(k)\}$ are sets of executions. We easily check that they are all events in
the σ-field $\mathcal{G}_A$. The care that we showed by keeping the reference to A in $\pi_{t,A}$ is
justified here by the fact that $\{\pi_{t,A} = \rho\}$ is an event in $\mathcal{G}_A$ which does depend on
A. These families of events naturally lead to event-schemas $\mathcal{N}(k)$, $\{\pi_t = \rho\}$, $W_i(k)$,
$\{|\mathcal{P}(k)| = m\}$, $\{i \in \mathcal{P}(k)\}$. For instance, $\mathcal{N}(k) = (\mathcal{N}_A(k))_{A \in \mathcal{A}}$ and
$\{\pi_t = \rho\} = (\{\pi_{t,A} = \rho\})_{A \in \mathcal{A}}$. The analysis will consider these event schemas in
connection with the probability schema $(\Omega_A, \mathcal{G}_A, P_A)_{A \in \mathcal{A}}$.


We now present the various no-lockout properties that we want to study. All are
possible formalizations of statement (*) given on page 61. Of great significance to us
is their adequacy³⁴: a measure which is not adequate does not confer valuable
information about the algorithm studied. To simplify the discussion we will sometimes
use the notation C, $B_1$, $B_2$, $B_3$ and $B_4$ in place of $W_i(k)$, $\{i \in \mathcal{P}(k)\}$, $\{\pi_{k-1} = \rho\}$,
$\{|\mathcal{P}(k)| = m\}$, and $\mathcal{N}(k)$, respectively. As suggested by our notation, we will
consider C to be the target property³⁵ and $B_1$, $B_2$, $B_3$ and $B_4$ to be the preconditions.
The target property can be considered only in conjunction with the precondition
$B_1$. Hence all the measures we will consider will have $B_1$ as a precondition. The
other preconditions can be introduced or omitted, each choice corresponding to a
different measure (actually to two different measures as we now discuss).


33 Recall that iid stands for “independent and identically distributed”.
34 See our discussion about the notion of adequate measures on page 53. We also review that
notion shortly here.


We argued at the beginning of this chapter that a measure reflects an actual
property of the Π/A-structure only if it does not perturb the dynamics of the game
between Player(1) and Player(2). We then say that the measure is adequate. All
the measures that we will consider formalize the preconditions using either
probabilistic conditioning or conditioning by adversary.³⁶ (At this point we know of no
other method to construct adequate measures.) As we will see, neither of these two
methods allows us to treat the precondition $B_3$ adequately. This is unfortunate
because, as is mentioned in [49], a “good” measure of no-lockout should be expressed
in terms of m, the actual number of participating processes in a round. In spite
of this fact we will treat the preconditions in the most adequate way, envisioning
various alternatives when no adequate formulation is available.


We actually have in mind to compute the probability of the target $W_i(k)$ at different
(random) points $s_k$ of the execution. The execution-fragment previous to $s_k$ is
(obviously) determined at the point $s_k$. The definition of Rabin’s Π/A-structure
implies that Player(2) then knows the run associated to that execution-fragment.
As is explained on page 54, the natural way to account for this situation is to use
probabilistic conditioning on the knowledge held by Player(2) at that point $s_k$. We
will consider two cases, when $s_k$ is the beginning of the execution and when it is the
beginning of round k. In the latter case the knowledge held by Player(2) is the past
run ρ. Hence the adequate formalization of this case is obtained by probabilistically
conditioning on $B_2 = \{\pi_{k-1} = \rho\}$. In the former case, when $s_k$ is the beginning
of the execution, there is no past and the corresponding adequate formalization
consists in simply omitting $B_2$ from the measure.


We now turn to $B_1 = \{i \in \mathcal{P}(k)\}$. As mentioned above the target $C = W_i(k)$ can
be considered only when this precondition is considered. In this case the situation
is not as clear as for $B_2$: the fact that a process i participates in round k depends
on both the strategy A used by Player(2) and on the random values drawn by the
algorithm during the round. Indeed, on the one hand i can participate only if
Player(2) plans to schedule it. On the other hand, i can also participate only if no
process scheduled before its turn comes succeeds in entering Crit. This depends (in
part) on the random lottery values drawn by these processes.


35 The notions of “target property” and “precondition” are defined on page 48.
36 These notions are defined on page 54.


There is one process though, for which the situation is unambiguous: the first 
one scheduled in round k by Player(2). This one depends solely on Player(2) (in 
the sense of Definition 3.2.1). When proving negative results (i.e., disproving the 
correctness of the algorithm) we can therefore consider that process: if correct, the 
algorithm should in particular ensure a good success rate for this specific process. 


On the other hand, when proving positive results (i.e., proving that the algorithm
satisfies some correctness-measure) we will, for lack of an adequate measure,
give Player(2) more power than it would have in any adequate situation. Indeed, the
correctness of the algorithm in that case implies the correctness for any adequate
measure, if any exists. We achieve this by letting Player(2) know the identity of the
process i with respect to which the test is conducted. Formally, this means that we
use probabilistic conditioning on $\{i \in \mathcal{P}(k)\}$.


We now turn to $B_3$, the most important character in the cast of B’s. (Recall again
that the whole purpose of the algorithm is to achieve a measure of fairness with
respect to the m currently participating processes.) Unfortunately, in that case, we
have no means in our panoply to interpret adequately this precondition ... unless
m = 1, in which case statement (*) of page 61 is vacuously true. Indeed, consider
first probabilistically conditioning on $B_3$. This means, as we saw, letting Player(2)
know the number of participating processes and then letting it act as it wishes
(provided that Player(2) allows with non-zero probability that $|\mathcal{P}(k)| = m$). This
gives a definite power to Player(2): Player(2) is in fact the main power in the
determination of m. Our Theorem 3.6.1 expresses that fact and shows that the
algorithm is incorrect ... for the inadequate correctness measure considered.


The other possibility at our disposal is to condition by adversary³⁷ on $B_3$. This
does not provide an adequate measure either. As we will show in Lemma 3.6.5 this
constrains in essence Player(2) to give steps to all participating processes when the
critical section is still closed. This implies in particular that Player(2) cannot play
on the order according to which it schedules the processes: that is precisely the
weapon used by Player(2) to defeat the previous measure. This restriction can be
viewed as “unfair” to Player(2): it ties its hands in a way that does not correspond
to the natural dynamics of Rabin’s Π/A-structure. Nevertheless our Theorem 3.6.2
shows that this over-constrained Player(2) still manages to defeat the algorithm.
The measure considered being the most restrictive towards Player(2), any measure
formalizing adequately that $|\mathcal{P}(k)| = m$ (if any exists) would similarly yield a defeat
of the algorithm.


37 See page 54.


This brings us to the last of the B event-schemas: $B_4 = \mathcal{N}(k)$. This event is
similarly neither part of the knowledge of Player(2)³⁸ at the beginning of round
k nor depending solely on him.³⁹ Thus, as we already saw several times, both
methods of probabilistic conditioning and conditioning by adversary are inadequate.
The most unfavorable to Player(1) is the probabilistic conditioning method. A
correctness result in that case will therefore imply correctness for any other adequate
formalization of $\mathcal{N}(k)$, if any exists. This is why probabilistic conditioning is used in
Theorem 3.6.6.


We are now at last in a position to present the various measures. We use the following
notations. The term weak refers to the fact that a 1/n lower bound on the
probability is sought. The term strong refers to a 1/m lower bound. The terms/notations i,
run, m and renew refer to $B_1$, $B_2$, $B_3$ and $B_4$ respectively. The term knowing refers
to a probabilistic conditioning. For instance (run, i, m and renew)-knowing
summarizes that probabilistic conditioning is performed on $B_1$, $B_2$, $B_3$ and $B_4$. The term
imposing refers to a conditioning by adversary. For instance i-imposing summarizes
that conditioning by adversary is performed on $B_1$.


The first two definitions involve evaluating the probabilities “at the beginning of
round k”. The measure $\inf_{A \in \mathcal{A}} P_A[W_i(k) \mid \pi_{k-1} = \rho]$ used in the first definition
is adequate (in the sense defined on page 53). It corresponds to a probabilistic
measuring obtained for Player(2) sitting at the beginning of round k (and hence
knowing the past (k − 1)-run ρ) and scheduling the first step of round k to process i.
As discussed on page 70, this measure is too specific to provide a relevant measure
of performance of the algorithm. Nevertheless it is a good measure to establish a
negative result: if correct, the algorithm should in particular ensure a good success
rate for the specific process i. The measures used in the other two definitions are not
adequate (in the sense defined on page 53). We argue this point after the definitions.


Definition 3.5.2 (Weak, Run-knowing, i-imposing, Probabilistic no-lockout)
A solution to the mutual exclusion problem satisfies weak, run-knowing,
i-imposing probabilistic no-lockout whenever there exists a constant c such that, for
every k ≥ 1, every (k − 1)-round run ρ and every process i,

\[ \inf_{\substack{A;\ P_A[\, i \in \mathcal{P}(k) \mid \pi_{k-1} = \rho \,] = 1 \\ P_A[\pi_{k-1} = \rho] > 0}} P_A[\, W_i(k) \mid \pi_{k-1} = \rho \,] \;\ge\; c/n. \]


38 See Definition 3.2.2.
39 See Definition 3.2.1.


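Definition 3.5.3 (Strong, Run & m-knowing, i-imposing, Probabilistic
no-lockout) The same as in Definition 3.5.2 except that:

\[ \inf_{\substack{A;\ P_A[\, i \in \mathcal{P}(k) \mid \pi_{k-1} = \rho,\ |\mathcal{P}(k)| = m \,] = 1 \\ P_A[\pi_{k-1} = \rho,\ |\mathcal{P}(k)| = m] > 0}} P_A[\, W_i(k) \mid \pi_{k-1} = \rho,\ |\mathcal{P}(k)| = m \,] \;\ge\; c/m. \]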


The next definition is the transcription of the previous one for the case where the
probability is “computed at the beginning of the execution” (i.e., $s_k = 0$ for all k).


Definition 3.5.4 (Strong, m-knowing, i-imposing, Probabilistic no-lockout)
The same as in Definition 3.5.2 except that:

\[ \inf_{\substack{A;\ P_A[\, i \in \mathcal{P}(k) \mid |\mathcal{P}(k)| = m \,] = 1 \\ P_A[\,|\mathcal{P}(k)| = m\,] > 0}} P_A[\, W_i(k) \mid |\mathcal{P}(k)| = m \,] \;\ge\; c/m. \]


We argue that the last two definitions are not adequate: both involve
probabilistically conditioning on the number m of participating processes in round k. As
mentioned above, the two definitions correspond to Player(2) sitting either at the
beginning of round k or at the beginning of the execution. In both situations, the
value of $|\mathcal{P}(k)|$ is not part of the knowledge⁴⁰ of Player(2). Therefore probabilistic
conditioning on the value m of $|\mathcal{P}(k)|$ yields an inadequate measure of performance.


By integration over ρ we see that an algorithm having the property of Definition 3.5.3
is stronger than one having the property of Definition 3.5.4. Equivalently, an
adversary able to falsify Property 3.5.4 is stronger than one able to falsify Property 3.5.3.


3.6 Our Results 


Here, we give a little more detail about the operation of Rabin’s algorithm than
we gave earlier in the introduction. At each round $k$ a new round number $R$ is
selected at random (uniformly among 100 values). The algorithm ensures that any
process $i$ that has already participated in the current round has $R_i = R$, and so
passes a test that verifies this. The variable $R$ acts as an “eraser” of the past:
with high probability, a newly participating process does not pass this test and
consequently chooses a new random number for its lottery value $B_i$. The distribution
used for this purpose is a geometric distribution that is truncated at $b = \log_2 n + 4$:
$P[\,\beta_i(k) = l\,] = 2^{-l}$ for $l \leq b - 1$. The first process that checks that its
lottery value is the highest obtained so far in the round, at a point when the critical
section is unoccupied, takes possession of the critical section. At this point the shared
variable is reinitialized and a new round begins.

40 See Definition 3.2.2.
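
To make the lottery distribution concrete, here is a minimal sketch of a sampler for it.
The text only specifies $P[\,\beta_i(k) = l\,] = 2^{-l}$ for $l \leq b - 1$; placing the
leftover mass $2^{-(b-1)}$ on the value $b$ is an assumption made here so that the
probabilities sum to one, and the names `truncation_point`, `lottery_pmf` and
`draw_lottery` are illustrative only.

```python
import math
import random


def truncation_point(n):
    """Truncation value b = ceil(log2 n) + 4 (the ceiling follows Figure 3.1)."""
    return math.ceil(math.log2(n)) + 4


def lottery_pmf(b):
    """P[beta = l] = 2**-l for l <= b - 1; the remaining mass goes to b (assumption)."""
    pmf = {l: 2.0 ** -l for l in range(1, b)}
    pmf[b] = 2.0 ** -(b - 1)
    return pmf


def draw_lottery(b, rng=random):
    """Draw one lottery value: flip fair coins until heads, stopping at b."""
    for l in range(1, b):
        if rng.random() < 0.5:
            return l
    return b
```

Under this reading a value $l$ is twice as likely as $l + 1$ up to the truncation point,
which is why high lottery values are rare and therefore informative for Player(2).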


The algorithm has the following two features. First, any participating process $i$
reinitializes its variable $B_i$ at most once per round. Second, the process winning
the competition takes at most two steps (and at least one) after the point $f_k$ of
the round at which the critical section becomes free. Equivalently, a process $i$ that
takes two steps after $f_k$ and does not win the competition cannot hold the current
maximal lottery value. (After having taken a step in round $k$ a process $i$ must hold
the current round number, i.e., $R_i(k) = R(k)$. On the other hand, the semaphore $S$
is set to 0 after $f_k$. If $i$ held the highest lottery value at its second step after
$f_k$ it would pass all three tests in the code and enter the critical section.) We will
take advantage of this last property in our constructions.


We are now ready to state our results. The first result states that the strong
m-knowing correctness property does not hold unless $n \leq 2$.


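Theorem 3.6.1 The algorithm does not have the strong no-lockout property of
Definition 3.5.4 (and hence of Definition 3.5.3). Indeed, if $n \geq 3$, there is an
adversary $A$ such that, for all rounds $k$, for all $m$, $2 \leq m \leq n$,

\[
P_A[\, m = |\mathcal{P}(k)| \,] > 0, \qquad
P_A[\, 1 \in \mathcal{P}(k) \mid m = |\mathcal{P}(k)| \,] = 1, \qquad
P_A[\, W_1(k) \mid m = |\mathcal{P}(k)| \,] = 0 .
\]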

    Shared variable: V = (S, B, R), where:
        S ∈ {0, 1}, initially 0
        B ∈ {0, 1, ..., ⌈log n⌉ + 4}, initially 0
        R ∈ {0, 1, ..., 99}, initially random

    Code for i:
    Local variables:
        B_i ∈ {0, ..., ⌈log n⌉ + 4}, initially 0
        R_i ∈ {0, 1, ..., 99}, initially ⊥
    Code:
        while V ≠ (0, B_i, R_i) do
            if (V.R ≠ R_i) or (V.B < B_i) then
                B_i ← random
                V.B ← max(V.B, B_i)
                R_i ← V.R
            unlock; lock;
        V ← (1, 0, random)
        unlock;
        ** Critical Region **
        lock;
        V.S ← 0
        R_i ← ⊥
        B_i ← 0
        unlock;
        ** Remainder Region **
        lock;

Figure 3.1: Rabin’s Algorithm


Proof. Assume first that $2 \leq m \leq n - 1$. Consider the following adversary $A$. $A$
does not use its knowledge about the past run $\rho$ (which is granted to Player(2) by the
$\Pi/A$-dynamics), gives one step to process 1 while the critical section is occupied,
waits for Exit and then adopts the schedule $2, 2, 3, 3, \ldots, n, n, 1$. This schedule
brings round $k$ to its end, because of the second property mentioned above (i.e.,
all processes are scheduled for two steps, one of which when the critical section is
empty). This adversary is such that $m$ wins with non zero probability, i.e.,
$P_A[\, m = |\mathcal{P}(k)| \,] > 0$. Also 1 is scheduled first so that, obviously,
$P_A[\, 1 \in \mathcal{P}(k) \mid m = |\mathcal{P}(k)| \,] = 1$. But, for this adversary,
$|\mathcal{P}(k)| = m$ happens exactly when process $m$ wins so that

\[
P_A[\, W_1(k) \mid m = |\mathcal{P}(k)| \,] = 0 .
\]
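
The first case of this proof can be replayed in a small simulation. The sketch below is
a simplified model, not the full algorithm: it assumes that every process draws a fresh
lottery value at its first step of the round (so round numbers play no role), and that
the leftover mass of the truncated geometric law sits on $b$. The function names are
illustrative.

```python
import random


def draw_lottery(b, rng):
    """Truncated geometric draw; the mass left over at b is an assumption."""
    for l in range(1, b):
        if rng.random() < 0.5:
            return l
    return b


def play_round(n, b, rng):
    """Play one round against the schedule 1 | 2,2,3,3,...,n,n,1.

    Returns (number of participants, winner). Simplifying assumption: every
    process draws a fresh lottery value at its first step of the round.
    """
    current_max = draw_lottery(b, rng)   # process 1 steps while Crit is occupied
    for j in range(2, n + 1):
        value = draw_lottery(b, rng)     # first step of j: new lottery value
        if value >= current_max:         # second step: j holds the maximum, enters Crit
            return j, j                  # processes 1..j have participated
    return n, 1                          # nobody matched process 1; its last step wins


def winners_by_participation(n, b, rounds, seed=0):
    """Collect, for each observed |P(k)|, the set of processes that won."""
    rng = random.Random(seed)
    seen = {}
    for _ in range(rounds):
        m, winner = play_round(n, b, rng)
        seen.setdefault(m, set()).add(winner)
    return seen
```

In this model, whenever the round ends with $m \leq n - 1$ participants the winner is
process $m$ itself, so process 1 never wins conditioned on $|\mathcal{P}(k)| = m$. Only
the value $m = n$ leaves process 1 a chance, which is why the proof treats that case
with a different schedule.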


Consider now the case where $m = n$. We consider then the adversary which gives
one step to process 2 while the critical section is occupied, waits for Exit and then
adopts the schedule $1, 1, 3, 3, \ldots, n, n, 2$. As above, this schedule brings round $k$
to its end. Similarly, $P_A[\, |\mathcal{P}(k)| = n \,] > 0$, namely when 2 holds the highest
lottery value. Also, 1 is scheduled with certainty. We now show that 1 must have a smaller
lottery value than 2 and hence cannot win access to Crit. Indeed, otherwise 1 would win
and $|\mathcal{P}(k)|$ would be equal to 2. This is a contradiction as we assume that
$|\mathcal{P}(k)|$ is $n$ and as, by assumption, $n$ is bigger than 2.


The previous result is not too surprising in the light of our previous discussion. The
measure

\[
\inf_{\substack{A;\; P_A[\, i \in \mathcal{P}(k) \,\mid\, m = |\mathcal{P}(k)| \,] = 1 \\ P_A[\, |\mathcal{P}(k)| = m \,] > 0}}
P_A[\, W_i(k) \mid m = |\mathcal{P}(k)| \,]
\]

is not adequate and Player(2) punishes us for using it: in “normal times”, i.e., under
the dynamics of the $\Pi/A$ structure, Player(2) is provided with only incomplete
information about the past, and definitely no information about the future. But
our inadequate measure gives her the future information $|\mathcal{P}(k)| = m$, allowing
her to target a specific strategy against any process.


We now give in Theorem 3.6.2 the more damaging result, stating (1) that, in spite 
of the randomization introduced in the round number variable R, Player(2) is able 
to infer the values held in the local variables and (2) that it is able to use this 
knowledge to lock out a process with probability exponentially close to 1. This result 
is truly damaging because the measure used is adequate: our result expresses that 
the algorithm is “naturally” incapable of withstanding the machinations of Player(2).


Theorem 3.6.2 There exists a constant $c < 1$, an adversary $A$, a round $k$ and a
$(k-1)$-round run $\rho$ such that:


We need the following definition in the proof. 


Definition 3.6.1 Let $l$ be a round. Assume that, during round $l$, Player(2) adopts
the following strategy. It first waits for the critical section to become free, then gives
one step to process $j$ and then two steps (in any order) to $s$ other processes. (We
will call these test-processes.) Assume that at this point the critical section is still
available (so that round $l$ is not over). We then say that process $j$ is an $s$-survivor
(at round $l$).


The idea behind this notion is that, by manufacturing survivors, Player(2) is able
to select processes having high lottery values. We now describe in more detail the 
selection of survivors and formalize this last fact. 
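
The selection effect can be seen in a small Monte Carlo sketch. It is a simplified model
under stated assumptions: the shared value $V.B$ is 0 when $j$ takes its step, every
test-process draws a fresh lottery value, and the truncated geometric law puts its
leftover mass on $b$. In that model $j$ survives exactly when its value is strictly
larger than every test value. The function names are illustrative.

```python
import random


def draw_lottery(b, rng):
    """Truncated geometric draw; the mass left over at b is an assumption."""
    for l in range(1, b):
        if rng.random() < 0.5:
            return l
    return b


def survivor_values(s, b, trials, seed=0):
    """Lottery values of the candidates that end up being s-survivors."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        candidate = draw_lottery(b, rng)                # the single step of process j
        tests = [draw_lottery(b, rng) for _ in range(s)]
        if max(tests) < candidate:                      # no test-process entered Crit
            kept.append(candidate)
    return kept
```

Comparing the empirical distribution of `survivor_values(s, b, trials)` with that of
plain `draw_lottery` calls shows the shift toward values around $\log_2 s$ and above that
the adversary exploits.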


In the following we will consider an adversary constructing sequentially a family
of $s$-survivors for the five values $s = 2^{\log_2 n + t}$, $t = -1, \ldots, -5$. Whenever
the adversary manages to select a new survivor it stores it, i.e., does not allocate it
any further step until the selection of survivors is completed. ($A$ actually allocates
steps to selected survivors, but only very rarely, to comply with fairness. Rarely
means for instance once every $nT^2$ steps, where $T$ is the expected time to select
an $n/2$-survivor.) By doing so, $A$ reduces the pool of test-processes still available.
We assume that, at any point in the selection process, the adversary selects the
test-processes at random among the set of processes still available. (The adversary
could be more sophisticated than random, but this is not needed.) Note that a new
$s$-survivor can be constructed with probability one whenever the available pool has
size at least $s + 1$: it suffices to reiterate the selection process until the selection
completes successfully.


Lemma 3.6.3 There is a constant $d$ such that for any $t = -5, \ldots, -1$, for any
$2^{\log_2 n + t}$-survivor $j$, for any $a = 0, \ldots, 5$,

\[
P_A[\, B_j(l) = \log_2 n + t + a \,] \;\geq\; d .
\]


Proof. Let $s$ denote $2^{\log_2 n + t}$. Let $j$ be an $s$-survivor and $i_1, i_2, \ldots, i_s$
be the test-processes used in its selection. Assume also that $j$ drew a new value
$B_j(l) = \beta_j(l)$ (this happens with probability $q_1 = .99$). Remark that
$B_j(l) = \mathrm{Max}\{B_{i_1}(l), \ldots, B_{i_s}(l), B_j(l)\}$: if this were not the case,
one of the test-processes would have entered Crit. As the test-processes are selected at
random, each of them has with probability .99 a round number different from $R(l)$ and
hence draws a new lottery number $\beta_i(l)$. Hence, with high probability $q_2 > 0$, 90%
of them do so. The others keep their old lottery value $B_i(l-1)$: this value, being old,
has lost in previous rounds and is therefore stochastically smaller$^{41}$ than a new
value $\beta_i(l)$. (An application of Lemma 8.1.5 formalizes this.) Hence, with
probability at least $q_1 q_2$ we have the following stochastic inequality:

\[
\mathrm{Max}\{\beta_1(l), \ldots, \beta_{s \cdot 90/100}(l)\}
\;\leq_{\mathcal{L}}\; B_j(l) \;\leq_{\mathcal{L}}\;
\mathrm{Max}\{\beta_1(l), \ldots, \beta_{s+1}(l)\} .
\]

Corollary 3.7.4 then shows that, for $a = 0, \ldots, 5$, with probability at least $q_1 q_2$,
$P_A[\, B_j(l) = \log_2 s + a \,] \geq q_3$ for some constant $q_3$ ($q_3$ is close to 0.01).
Hence, with probability at least $d = q_1 q_2 q_3$, $B_j(l)$ is equal to $\log_2 s + a$.

41 A real random variable $X$ is stochastically smaller than another one $Y$ (we write
$X \leq_{\mathcal{L}} Y$) exactly when, for all $x \in \mathbb{R}$, $P[X \geq x] \leq P[Y \geq x]$.
Hence, if $X \leq Y$ in the usual sense, it is also stochastically smaller.
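
The stochastic order used in this proof ($X$ is stochastically smaller than $Y$ when
$P[X \geq x] \leq P[Y \geq x]$ for every $x$) can be checked mechanically for finite
distributions. The helper below is a hypothetical illustration, not part of the thesis;
it uses exact fractions, and the two example laws in the usage note are made up.

```python
from fractions import Fraction


def tail(pmf, x):
    """P[X >= x] for a finite distribution given as {value: probability}."""
    return sum(p for v, p in pmf.items() if v >= x)


def stochastically_smaller(pmf_x, pmf_y):
    """True when P[X >= x] <= P[Y >= x] for every threshold x.

    Both tails are step functions that only change at support points, so it is
    enough to compare them on the union of the two supports.
    """
    thresholds = set(pmf_x) | set(pmf_y)
    return all(tail(pmf_x, x) <= tail(pmf_y, x) for x in thresholds)
```

For instance, a law that puts mass 3/4 on the value 1 and 1/4 on the value 2 is
stochastically smaller than the uniform law on {1, 2}, but not conversely.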


Proof of Theorem 3.6.2. The adversary uses a preparation phase to select and
store some processes having high lottery values. We will, by abuse of language,
identify this phase with the run $\rho$ which corresponds to it. When this preparation
phase is over, round $k$ begins.


Preparation phase $\rho$: For each of the five values $\log_2 n + t$, $t = -5, \ldots, -1$, $A$
selects in the preparation phase many (“many” means $n/20$ for $t = -5, \ldots, -2$ and
$6n/20$ for $t = -1$) $2^{\log_2 n + t}$-survivors. Let $S_1$ denote the set of all the
survivors thus selected. (Note that $|S_1| = n/2$ so that we have enough processes to
conduct this selection.) By partitioning the set of $2^{\log_2 n - 1}$-survivors into six
sets of equal size, for each of the ten values $t = -5, \ldots, 4$, $A$ has then secured the
existence of $n/20$ processes whose lottery value is $\log_2 n + t$ with probability bigger
than $d$. (By Lemma 3.6.3.)


Round $k$: While the critical section is busy, $A$ gives a step to each of the $n/2$
processes from the set $S_2$ that it did not select in phase $\rho$. (We can without loss of
generality assume that process 1 is in that set $S_2$: hence $P_A[\, 1 \in \mathcal{P}(k) \,] = 1$,
which was to be verified.) When this is done, with probability at least $1 - 2^{-32}$
(see Corollary 3.7.2) the program variable $B$ holds a value greater than or equal to
$\log_2 n - 5$. The adversary then waits for the critical section to become free and gives
steps to the processes of $S_1$ it selected in phase $\rho$. A process in $S_2$ can win access
to the critical section only if the maximum lottery value
$B_{S_2} \stackrel{\mathrm{def}}{=} \mathrm{Max}_{j \in S_2} B_j$ of all the processes in $S_2$ is
strictly less than $\log_2 n - 5$ or if no process of $S_1$ holds both the correct round
number $R(k)$ and the lottery number $B_{S_2}$. This consideration gives the bound
predicted in Theorem 3.6.2 with $c = (1 - d/100)^{1/20}$.
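
The origin of this constant can be sketched as follows. The computation rests on a
reading of the argument that is an assumption here, since the text only states the
result: each of the $n/20$ stored survivors of the relevant group holds the lottery value
currently in the shared variable with probability at least $d$, agrees with the round
number $R(k)$ with probability $1/100$, and these events are treated as independent across
survivors.

```latex
\[
P\bigl[\text{no survivor of the group holds both } R(k) \text{ and the value in } V.B\bigr]
\;\leq\; \Bigl(1 - \frac{d}{100}\Bigr)^{n/20}
\;=\; \Bigl[\Bigl(1 - \frac{d}{100}\Bigr)^{1/20}\Bigr]^{n}
\;=\; c^{\,n} .
\]
```

Since $c < 1$, this term decreases exponentially in $n$.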


The lesson brought by this last proof is that the variable R does not act as an eraser 
of the past as it was originally believed and that the adversary can correspondingly 
use old values to defeat the algorithm. 


Furthermore, our proof demonstrates that there is an adversary that can lock out,
with probability exponentially close to 1, an arbitrary set of $n/2$ processes during
some round. With a slight improvement we can derive an adversary that will succeed
in locking out (with probability exponentially close to 1) a given set $S_3$ of, for
example, $n/100$ processes at all rounds: we just need to remark that the adversary
can do without this set $S_3$ during the preparation phase $\rho$. The adversary would
then alternate preparation phases $\rho_1, \rho_2, \ldots$ with rounds $k_1, k_2, \ldots$ The set
$S_3$ of processes would be given steps only during rounds $k_1, k_2, \ldots$ and would be
locked out at each time with probability exponentially close to 1.


In view of our counterexample we might think that increasing the size of the shared
variable might yield a solution. For instance, if the geometric distribution used by
the algorithm is truncated at the value $b = 2 \log_2 n$ instead of $\log_2 n + 4$, then the
adversary is not able as before to ensure a lower bound on the probability that an
$n/2$-survivor holds $b$ as its lottery value. (The probability is given by Theorem 3.7.1
with $x = \log_2 n$.) Then the argument of the previous proof does not hold anymore.
Nevertheless, the next theorem establishes that raising the size of the shared variable
does not help as long as the size stays sub-linear. But this is exactly the theoretical
result the algorithm was supposed to achieve. (Recall the $n$-lower bound of [12]
in the deterministic case.) Furthermore, the remark made above applies here also:
a set of processes of linear size can be locked out at each time with probability
arbitrarily close to 1.


Theorem 3.6.4 Suppose that we modify the algorithm so that the set of possible
round numbers used has size $r$ and that the set of possible lottery numbers has size
$b$ ($\log_2 n + 4 \leq b \leq n$). Then there exist positive constants $c_1$ and $c_2$, an
adversary $A$, and a run $\rho$ such that


Pal Wi(k) | m1 =p, LEP(k)] <eP $e" +s 
Pall € P(k) | te-1 = p| =1 
Paltp—1 = p| >. 


Proof. We consider the adversary $A$ described in the proof of Theorem 3.6.2: for
$t = -5, \ldots, -2$, $A$ prepares a set $T_t$ of $2^{\log_2 n + t}$-survivors, each of size
$n/20$, and a set $T_{-1}$ of $2^{\log_2 n - 1}$-survivors; the size of $T_{-1}$ is $(6/20)n$.
(We can as before think of
this set as being partitioned into six different sets.) We let 7 stand for 6/20 in the 
sequel. 


Let $p_l$ denote the probability that process 1 holds $l$ as its lottery value after having
taken a step in round $k$. For any process $j$ in $T_{-1}$ let also $q_l$ denote the
probability that process $j$ holds $l$ as its lottery value at the end of the preparation
phase $\rho$.


The same reasoning as in Theorem 3.6.2 then leads to the inequality: 
Pal Wi(k) | te-1 = p, 1 © P(A) ] 
Se 4 (Let df 4 SD (a — Sym. 


I>logont+5 r 
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Write | = log,n + x — 1 = log,(n/2) + 2. Then, as is seen in the proof of Corol- 
lary 3.7.4, q = e7? °2!-* for some ¢ € (a,a +1). For 1 > log,n +5, « is at least 6 
and e~?' ° ~ 1 s0 that q ~ 2!-$ > 2!-*. On the other hand p, = 27! = 2-*t1 In, 


Define b(a) = e7?"""/" so that u/(2) = ew? "1/2!" nn /r. Then: 


S- Pm(L— ym 


I>loga+5 L>6 
The next result, Theorem 3.6.6, shows that the two flaws exhibited in Theorems 3.6.1
and 3.6.2 are at the core of the problem: the algorithm does have the strong
no-lockout property when we condition by adversary$^{42}$ on the property
$\{|\mathcal{P}(k)| = m\}$ and when we force the algorithm to draw new values for the
modified internal variables.


Conditioning by adversary specifically solves the problem expressed by Theorem 3.6.1:
the measure analyzed in that theorem is inadequate$^{43}$ and allows too much knowledge
to Player(2). On the other hand, forcing the adversary to reset to new values
the internal variables of the participating processes resolves the problem revealed
by (the proof of) Theorem 3.6.2. We will prove these facts in Theorem 3.6.6 for
a slightly modified version of the algorithm. Recall in effect that the code given
on page 74 is optimized by making a participating process $i$ draw a new lottery
number when it is detected that $V.B < B_i$. For simplicity, we will consider the
“de-optimized” version of the code in which only the test “$V.R \neq R_i$?” causes a
new drawing to occur. It is clear that a correctness statement for that de-optimized
algorithm implies a similar result for the original algorithm.

42 See page 54 for a definition of conditioning by adversary.
43 See page 53 for a discussion of adequate measures.


We proved that result before we were fully aware of the notion of adequate measures. 
Our original result is presented in Proposition 3.6.7 and is based on the notion of 
restricted adversary presented in Definition 3.6.2. As is shown in Lemma 3.6.5, a 
restricted adversary is exactly one that decides with probability one the set of par- 
ticipating processes. This result easily establishes the equivalence of Theorem 3.6.6 
and Proposition 3.6.7. 


Definition 3.6.2 A step taken after the time at which Crit becomes free in round 
k is called a k-real step. We say that an adversary is k-restricted when, in round k, 
the set of participating processes is composed exactly of the set of processes scheduled 
when Crit is closed, along with the first process taking a k-real step. (That process 
might have already taken a step when Crit was closed.) An adversary is said to be 
restricted when it is k-restricted for every k. 


Notation. We will use the notation A’ (as opposed to A) in the following arguments 
to emphasize when the adversaries considered are k-restricted. 


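Lemma 3.6.5 For every process $i$, for every round $k \geq 1$, and for every
$(k-1)$-round run $\rho$,$^{44}$

\[
\bigl\{ A';\ A' \text{ is } k\text{-restricted and }
P_{A'}[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0 \bigr\}
\]
\[
= \bigl\{ A;\ P_A[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] = 1 \bigr\} .
\]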


Proof. Note first that, as by assumption an adversary is deterministic, randomness
can affect the decisions of Player(2) only through the information Player(2)
receives from Player(1): the strategy of Player(1), i.e., the algorithm, is indeed
randomized. A moment of thought based on the description of the function $f$ given
on page 65 shows furthermore that, for Player(2), the only visible effects of randomness
are whether a process in Try or Rem enters Crit when scheduled while the
critical section is free. In particular, in round $k$ Player(2) follows a deterministic
behavior until it learns the region $\mathit{new}_{i_1}$ (either Try or Crit) reached by the
first process $i_1$ scheduled to take a $k$-real step. We will use this fact in both
directions of the proof.

44 Using Convention 8.1.1, page 198, we set $P[B \mid A] = 0$ whenever $P[A] = 0$.


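For a $k$-restricted adversary, the set $\mathcal{P}(k)$ is by definition defined during that
period where $A$ behaves deterministically. This implies that, for every $k$-restricted
adversary $A'$, we have
$P_{A'}[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0$
only if
$P_{A'}[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] = 1$.
This establishes that

\[
\bigl\{ A';\ A' \text{ is } k\text{-restricted and }
P_{A'}[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0 \bigr\}
\]
\[
\subseteq \bigl\{ A;\ P_A[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] = 1 \bigr\} .
\]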


We now show the converse inclusion. Consider some adversary $A$ such that
$P_A[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] = 1$
for some value $m$. By the property mentioned at the beginning of the proof, the adversary
follows a deterministic behavior until the first $k$-real step. Call $i_1$ the process
taking that first $k$-real step. Let $l$ be the number of processes scheduled up to that
point (including $i_1$). Obviously $l \leq m$. Remark that (conditioned on $\mathcal{N}(k)$ and
$\pi_{k-1} = \rho$) $i_1$ enters Crit with some non-zero probability when taking its first
$k$-real step. This means that
$P_A[\, |\mathcal{P}(k)| = l,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0$.
The assumption
$P_A[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] = 1$
then implies that $l = m$. This precisely means that $A$ is $k$-restricted.


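Theorem 3.6.6 The algorithm satisfies strong, run and renew-knowing, i and
m-imposing probabilistic no-lockout. Equivalently, for every process $i$, for every round
$k \geq 1$, for every $m \leq n$ and for every $(k-1)$-round run $\rho$ we have:

\[
\inf_{\substack{A;\; P_A[\, |\mathcal{P}(k)| = m,\ i \in \mathcal{P}(k) \,\mid\, \mathcal{N}(k),\ \pi_{k-1} = \rho \,] = 1 \\ P_A[\, \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0}}
P_A[\, W_i(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] \;\geq\; \frac{2}{3m} .
\]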


Proof. This result is a simple consequence of Proposition 3.6.7 whose proof is
presented next. In that proposition one considers the set of $k$-restricted adversaries
$A'$ such that
$P_{A'}[\, \mathcal{N}(k),\ \pi_{k-1} = \rho,\ i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m \,] > 0$.
This condition is equivalent to the conjunction of the two inequalities
$P_{A'}[\, \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0$ and
$P_{A'}[\, i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0$.
By Lemma 3.6.5 the set of conditions

\[
\begin{cases}
A' \text{ is a restricted adversary,} \\
P_{A'}[\, i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0, \\
P_{A'}[\, \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0
\end{cases}
\]

is equivalent to

\[
\begin{cases}
P_A[\, i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m \mid \mathcal{N}(k),\ \pi_{k-1} = \rho \,] = 1, \\
P_A[\, \mathcal{N}(k),\ \pi_{k-1} = \rho \,] > 0,
\end{cases}
\]

where no a priori restriction applies here to $A$. Theorem 3.6.6 is therefore a direct
consequence of Proposition 3.6.7.


Note that we proved along the way that if the adversary is m-imposing and renew-knowing
then the distinction between an i-knowing and an i-imposing adversary disappears.
More formally,

\[
\{\, A;\ P_A[\, \mathcal{B}_3 \mid \mathcal{B}_1, \mathcal{B}_2, \mathcal{B}_4 \,] = 1 \,\}
= \{\, A;\ P_A[\, \mathcal{B}_3, \mathcal{B}_1 \mid \mathcal{B}_2, \mathcal{B}_4 \,] = 1 \,\} .
\]

This shows that the way the precondition $\mathcal{B}_1$ is formalized is inconsequential for
m-imposing and renew-knowing adversaries. On the other hand, as we saw already
several times, the adequate$^{45}$ formalization of the precondition $\mathcal{B}_2$ is obtained
by probabilistic conditioning. This establishes that the measure used in Theorem 3.6.6
is as adequate$^{46}$ as it can be, provided that the adversary is m-imposing and
renew-knowing. These two restrictions are brought to solve the two problems revealed in
Theorems 3.6.1 and 3.6.2, respectively.


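Proposition 3.6.7 Let $i$ be a process, $k$ a round number, and $\rho$ be a $(k-1)$-round
run. For concision of notation we let $I$ denote the event schema
$\{\mathcal{N}(k),\ \pi_{k-1} = \rho,\ i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m\}$. We have:

\[
\inf_{A';\; P_{A'}[\, I \,] > 0}
P_{A'}[\, W_i(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho,\ i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m \,]
\;\geq\; \frac{2}{3m} .
\]
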
Proof. We will make constant use of the notation $[n] = \{1, 2, \ldots, n\}$. Also, for
any sequence $(a_j)_{j \in J}$ we will write $a_i = \mathrm{Umax}_{j \in J}\, a_j$ to mean that
$i$ is the only index in $J$ for which $a_i = \mathrm{Max}_{j \in J}\, a_j$.

We first define the events $U(k)$ and $U'_J(k)$, where $J$ is any subset of $\{1, \ldots, n\}$:

\[
U(k) \stackrel{\mathrm{def}}{=} \bigl\{ \exists!\, i \in \mathcal{P}(k) \text{ s.t. }
B_i(k) = \mathrm{Max}_{j \in \mathcal{P}(k)}\, B_j(k) \bigr\},
\]
\[
U'_J(k) \stackrel{\mathrm{def}}{=} \bigl\{ \exists!\, i \in J \text{ s.t. }
\beta_i(k) = \mathrm{Max}_{j \in J}\, \beta_j(k) \bigr\} .
\]

45 We never used the fact that we were conditioning on $\mathcal{B}_2$ so that the same
equality holds without the mention of $\mathcal{B}_2$. As is discussed on page 69, this
corresponds to analyzing the system at the point $s_k$ equal to the beginning of the
execution.
46 The term adequate is used in the sense defined on page 53.

The main result established in [49] can formally be restated as:

\[
\forall m \leq n, \quad P[\, U'_{[m]}(k) \,] \geq 2/3 . \tag{3.3}
\]


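Following the general proof technique described in the introduction we will prove
that:

\[
P_{A'}[\, U(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho,\ i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m \,]
= P[\, U'_{[m]}(k) \,],
\]

and that:

\[
P_{A'}[\, W_i(k) \mid \mathcal{N}(k),\ \pi_{k-1} = \rho,\ i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m,\ U(k) \,]
= P\bigl[\, \beta_i(k) = \mathrm{Max}_{j \in [m]}\, \beta_j(k) \bigm| U'_{[m]}(k) \,\bigr] .
\]

The events involved in the LHS of the two equalities (e.g., $W_i(k)$, $U(k)$,
$\{|\mathcal{P}(k)| = m\}$, $\{\pi_{k-1} = \rho\}$, $\{i \in \mathcal{P}(k)\}$) depend on $A'$ whereas
the events involved in the RHS are pure mathematical events over which $A'$ has no control.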


We begin with some important remarks. 


(1) By definition, the set $\mathcal{P}(k) = \{i_1, i_2, \ldots\}$ is decided by the restricted
adversary $A'$ at the beginning of round $k$: for a given $A'$ and conditioned on
$\{\pi_{k-1} = \rho\}$, the set $\mathcal{P}(k)$ is defined deterministically. In particular, for
any $i$, $P_{A'}[\, i \in \mathcal{P}(k) \mid \pi_{k-1} = \rho \,]$ has value 0 or 1. Similarly,
there is one value $m$ for which $P_{A'}[\, |\mathcal{P}(k)| = m \mid \pi_{k-1} = \rho \,] = 1$.
Hence, for a given adversary $A'$, if the random event
$\{\mathcal{N}(k),\ \pi_{k-1} = \rho,\ i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m\}$ has non zero
probability, it is equal to the random event $\{\mathcal{N}(k),\ \pi_{k-1} = \rho\}$.

(2) Recall that, in the modified version of the algorithm that we consider here, a
process $i$ draws a new lottery value in round $k$ exactly when $R_i(k-1) \neq R(k)$. Hence,
within $I$, the event $\mathcal{N}(k)$ is equal to
$\{R_{i_1}(k-1) \neq R(k), \ldots, R_{i_m}(k-1) \neq R(k)\}$. On the other hand, by
definition, the random variables (in short r.v.s) $\beta_{i_j}$; $i_j \in \mathcal{P}(k)$
are iid and independent from the r.v. $R(k)$. This proves that (for a given $A'$),
conditioned on $\{\pi_{k-1} = \rho\}$, the r.v. $\mathcal{N}(k)$ is independent from all the
r.v.s $\beta_{i_j}$. Note that $U'_{\mathcal{P}(k)}(k)$ is defined in terms of (i.e., measurable
with respect to) the $(\beta_{i_j};\ i_j \in \mathcal{P}(k))$, so that $U'_{\mathcal{P}(k)}(k)$ and
$\mathcal{N}(k)$ are also independent.


(3) More generally, consider any r.v. $X$ defined in terms of the
$(\beta_{i_j};\ i_j \in \mathcal{P}(k))$: $X = f(\beta_{i_1}, \ldots, \beta_{i_m})$ for some
measurable function $f$. Recall once more that the number $m$ and the indices
$i_1, \ldots, i_m$ are determined by $\{\pi_{k-1} = \rho\}$ and $A'$. The r.v.s $\beta_{i_j}$
being iid, for a fixed $A'$, $X$ then depends on $\{\pi_{k-1} = \rho\}$ only through
the value $m$ of $|\mathcal{P}(k)|$. Formally, this means that, conditioned on
$|\mathcal{P}(k)|$, the r.v.s $X$ and $\{\pi_{k-1} = \rho\}$ are independent:
$E_{A'}[\, X \mid \pi_{k-1} = \rho \,] = E_{A'}[\, X \mid |\mathcal{P}(k)| = m \,]
= E[\, f(\beta_1, \ldots, \beta_m) \,]$. (More precisely, this equality is valid for the value
$m$ for which $P_{A'}[\, \pi_{k-1} = \rho,\ |\mathcal{P}(k)| = m \,] \neq 0$.) A special
consequence of this fact is that
$P_{A'}[\, U'_{\mathcal{P}(k)}(k) \mid \pi_{k-1} = \rho \,] = P[\, U'_{[m]}(k) \,]$.

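Remark that, in $U(k)$, the event $W_i(k)$ is the same as the event
$\{B_i(k) = \mathrm{Umax}_{j \in \mathcal{P}(k)}\, B_j(k)\}$. This justifies the first following
equality. The subsequent ones are commented afterwards. Also, the set $I$ that we consider
here is the one having a non zero probability described in Remark (1) above.

\[
\begin{aligned}
P_{A'}[\, W_i(k) \mid U(k),\ I \,]
&= P_{A'}\bigl[\, B_i(k) = \mathrm{Umax}_{j \in \mathcal{P}(k)}\, B_j(k) \bigm| U(k),\ I \,\bigr] \\
&= P_{A'}\bigl[\, \beta_i(k) = \mathrm{Umax}_{j \in \mathcal{P}(k)}\, \beta_j(k) \bigm| U'_{\mathcal{P}(k)}(k),\ I \,\bigr] && (3.4) \\
&= P_{A'}\bigl[\, \beta_i(k) = \mathrm{Umax}_{j \in \mathcal{P}(k)}\, \beta_j(k) \bigm| U'_{\mathcal{P}(k)}(k),\ \pi_{k-1} = \rho \,\bigr] && (3.5)
\end{aligned}
\]

Equation 3.4 is true because we condition on $\mathcal{N}(k)$ and because
$U(k) \cap \mathcal{N}(k) = U'_{\mathcal{P}(k)}(k)$. Equation 3.5 is true because $\mathcal{N}(k)$
is independent from the r.v.s $\beta_{i_j}$, as is shown in Remark (2) above.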


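We then notice that the events $\{\beta_i(k) = \mathrm{Umax}_{j \in \mathcal{P}(k)}\, \beta_j(k)\}$
and $U'_{\mathcal{P}(k)}(k)$ (and hence their intersection) are defined in terms of the r.v.s
$\beta_{i_j}$. From Remark (3) above, the value of Eq. 3.5 depends only on $m$ and is
therefore independent of $i$. Hence, for all $i$ and $j$ in $\mathcal{P}(k)$,
$P_{A'}[\, W_i(k) \mid U(k),\ I \,] = P_{A'}[\, W_j(k) \mid U(k),\ I \,]$.

On the other hand,
$\sum_{i \in \mathcal{P}(k)} P_{A'}\bigl[\, \beta_i(k) = \mathrm{Umax}_{j \in \mathcal{P}(k)}\, \beta_j(k)
\bigm| U'_{\mathcal{P}(k)}(k),\ \pi_{k-1} = \rho \,\bigr] = 1$:
indeed, one of the $\beta_{i_j}$ has to attain the maximum.

These last two facts imply that, $\forall i \in \mathcal{P}(k)$,

\[
P_{A'}[\, W_i(k) \mid U(k),\ I \,] = 1/m .
\]

We now turn to the evaluation of $P_{A'}[\, U(k) \mid I \,]$.

\[
\begin{aligned}
P_{A'}[\, U(k) \mid I \,]
&= P_{A'}[\, U'_{\mathcal{P}(k)}(k) \mid I \,] && (3.6) \\
&= P_{A'}[\, U'_{\mathcal{P}(k)}(k) \mid \pi_{k-1} = \rho \,] && (3.7) \\
&= P[\, U'_{[m]}(k) \,] \\
&\geq 2/3 . && (3.8)
\end{aligned}
\]

Equation 3.6 is true because we condition on $\mathcal{N}(k)$. Eq. 3.7 is true because
$U'_{\mathcal{P}(k)}(k)$ and $\mathcal{N}(k)$ are independent (see Remark (2) above). The equality
of Eq. 3.8 stems from Remark (3) above and the inequality from Eq. 3.3.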
We can now finish the proof of Proposition 3.6.7.
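
One way to carry out this last step is sketched below. It combines the two evaluations
obtained above, namely $P_{A'}[\, W_i(k) \mid U(k),\ I \,] = 1/m$ and
$P_{A'}[\, U(k) \mid I \,] \geq 2/3$, where $I$ stands for the conditioning event
$\{\mathcal{N}(k),\ \pi_{k-1} = \rho,\ i \in \mathcal{P}(k),\ |\mathcal{P}(k)| = m\}$ of the
proposition. The first line simply discards the executions outside $U(k)$; this
presentation is a reconstruction and not necessarily the wording of the original.

```latex
\begin{align*}
P_{A'}[\, W_i(k) \mid I \,]
  &\geq P_{A'}[\, W_i(k),\ U(k) \mid I \,] \\
  &= P_{A'}[\, W_i(k) \mid U(k),\ I \,] \; P_{A'}[\, U(k) \mid I \,] \\
  &\geq \frac{1}{m} \cdot \frac{2}{3} \;=\; \frac{2}{3m} .
\end{align*}
```

The bound holds for every restricted adversary $A'$ with $P_{A'}[\, I \,] > 0$, hence
also for the infimum over such adversaries.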


We discuss here the lessons brought by our results. (1) Conditioning on $\mathcal{N}(k)$ is
equivalent to forcing the algorithm to refresh all the variables at each round. By
doing this, we took care of the undesirable lingering effects of the past, exemplified
in Theorems 3.6.2 and 3.6.4. (2) It is not true that:

\[
P_A\bigl[\, \beta_i(k) = \mathrm{Max}_{j \in \mathcal{P}(k)}\, \beta_j(k) \bigm| U'_{\mathcal{P}(k)}(k),\ |\mathcal{P}(k)| = m \,\bigr]
= P\bigl[\, \beta_i(k) = \mathrm{Max}_{j \in [m]}\, \beta_j(k) \bigm| U'_{[m]}(k) \,\bigr],
\]

i.e., that the adversary has no control over the event
$\{\beta_i(k) = \mathrm{Max}_{j \in \mathcal{P}(k)}\, \beta_j(k)\}$. (This was Rabin’s statement
in [49].) Indeed, the latter probability is equal to $1/m$ whereas we proved in
Theorem 3.6.1 that there is an adversary for which the former is 0 when $2 \leq m \leq n$.


The crucial remark explaining this apparent paradox is that, implicit in the expression
$P_A[\, \beta_i(k) = \mathrm{Max}_{j \in \mathcal{P}(k)}\, \beta_j(k) \mid \ldots \,]$, is the fact
that the random variables $\beta_j(k)$ (for $j \in \mathcal{P}(k)$) are compared to each other
in a specific way decided by $A$, before one of them reveals itself to be the maximum. For
instance, in the example constructed in the proof of Theorem 3.6.1, when $j$ takes a step,
$\beta_j(k)$ is compared only to the $\beta_l(k)$, $l < j$, and the situation is not
symmetric among the processes in $\mathcal{P}(k)$.


But, if the adversary is restricted as in our Definition 3.6.2, or if equivalently, proba- 
bilistic conditioning is done on |P(k)| = m, the symmetry is restored and the strong 
no-lockout property holds. 


Rabin and Kushilevitz used these ideas from our analysis to produce their algo- 
rithm [36]. 


Our Theorems 3.6.1, 3.6.2 and 3.6.4 explored how the adversary can gain and use
knowledge of the lottery values held by the processes. The next theorem states that
the adversary is similarly able to derive some knowledge about the round numbers,
contradicting the claim in [49] that “because the variable R is randomized just
before the start of the round, we have with probability 0.99 that $R_i \neq R$.” Note
that, expressed in our terms, the previous claim translates into $R(k) \neq R_i(k-1)$.
Note also that the next measure is adequate, in the sense defined on page 53.


Theorem 3.6.8 There exists an adversary $A$, a round $k$, a step number $t$, a run
$\rho_t$, compatible with $A$, having $t$ steps and in which round $k$ is under way such that

\[
P_A[\, R(k) \neq R_1(k-1) \mid \pi_t = \rho_t \,] < .99 .
\]


Proof. We will write $\rho_t = \rho'\rho$ where $\rho'$ is a $(k-1)$-round run and $\rho$ is
the run fragment corresponding to the $k$th round under way. Assume that $\rho'$ indicates
that, before round $k$, processes 1, 2, 3, 4 participated only in round $k-1$, and that
process 5 never participated before round $k$. Furthermore, assume that during round $k-1$
the following pattern happened: $A$ waited for the critical region to become free,
then allocated one step in turn to processes 2, 1, 1, 3, 3, 4, 4; at this point 4 entered
the critical region. (All this is indicated in $\rho'$.) Assume also that the partial run
$\rho$ into round $k$ indicates that the critical region became free before any competing
process was given a step, and that the adversary then allocated one step in turn to
processes 5, 3, 3, and that, after 3 took its last step, the critical section was still
free. We will establish that, at this point,

\[
P_A[\, R(k) \neq R_1(k-1) \mid \pi_t = \rho'\rho \,] < .99 .
\]


By assumption $k-1$ is the last (and only) round before round $k$ where processes
1, 2, 3 and 4 participated. Hence $R_1(k-1) = R_2(k-1) = R_3(k-1) = R(k-1)$.
To simplify the notations we will let $R'$ denote this common value. Similarly we
will write $\beta'_1, \beta'_2, \ldots$ in place of $\beta_1(k-1), \beta_2(k-1), \ldots$ We will
furthermore write $\beta_1, \beta_2, \ldots$ in place of $\beta_1(k), \beta_2(k), \ldots$ and
$B$, $R$ in place of $B(k)$, $R(k)$.


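Using Bayes rule gives us:

\[
P_A[\, R \neq R' \mid \rho', \rho \,]
= \frac{P_A[\, R \neq R' \mid \rho' \,]\; P_A[\, \rho \mid \rho',\ R \neq R' \,]}{P_A[\, \rho \mid \rho' \,]} . \tag{3.9}
\]

In the numerator, the first term $P_A[\, R \neq R' \mid \rho' \,]$ is equal to 0.99 because
$R$ is uniformly distributed and independent from $R'$ and $\rho'$. We will use this fact
another time while expressing the value of $P_A[\, \rho \mid \rho' \,]$:

\[
\begin{aligned}
P_A[\, \rho \mid \rho' \,]
&= P_A[\, \rho \mid \rho',\ R \neq R' \,]\; P_A[\, R \neq R' \mid \rho' \,]
 + P_A[\, \rho \mid \rho',\ R = R' \,]\; P_A[\, R = R' \mid \rho' \,] \\
&= 0.99\, P_A[\, \rho \mid \rho',\ R \neq R' \,] + 0.01\, P_A[\, \rho \mid \rho',\ R = R' \,] .
\end{aligned} \tag{3.10}
\]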


• Consider first the case where $R \neq R'$. Then process 3 gets a YES answer when
going through the test “$(V.R \neq R_3)$ or $(V.B < B_3)$”, and consequently chooses a
new value $B_3(k) = \beta_3$. Hence

\[
P_A[\, \rho \mid \rho',\ R \neq R' \,] = P[\, \beta_3 < \beta_5 \,] . \tag{3.11}
\]

• Consider now the case $R = R'$. By hypothesis, process 5 never participated in the
computation before round $k$ and hence draws a new number $B_5(k) = \beta_5$. Hence:

\[
P_A[\, \rho \mid \rho',\ R = R' \,] = P_A[\, B_3(k) < \beta_5 \mid \rho',\ R = R' \,] . \tag{3.12}
\]


As processes 1, ..., 4 participated only in round $k-1$ up to round $k$, the knowledge
provided by $\rho'$ about process 3 is exactly that, in round $k-1$, process 3 lost to
process 2 along with process 1, and that process 2 lost in turn to process 4, i.e., that
$\beta'_3 < \beta'_2$, $\beta'_1 < \beta'_2$ and $\beta'_2 \leq \beta'_4$. For the sake of
notational simplicity, for the rest of this paragraph we let $X$ denote a random variable
whose law is the law of $\beta'_3$ conditioned on
$\{\beta'_2 > \mathrm{Max}\{\beta'_1, \beta'_3\},\ \beta'_2 \leq \beta'_4\}$. This means for
instance that, $\forall x \in \mathbb{R}$,

\[
P[\, X \geq x \,] = P\bigl[\, \beta'_3 \geq x \bigm| \beta'_2 > \mathrm{Max}\{\beta'_1, \beta'_3\},\ \beta'_2 \leq \beta'_4 \,\bigr] .
\]


When 3 takes its first step within round $k$, the program variable $V.B$ holds the value
$\beta_5$. As a consequence, 3 chooses a new value when and exactly when $B_3(k-1)$
$(= \beta'_3)$ is strictly bigger than $\beta_5$. (The case $\beta'_3 = \beta_5$ would lead 3
to take possession of the critical section at its first step in round $k$, in contradiction
with the definition of $\rho$; and the case $\beta'_3 < \beta_5$ leads 3 to keep its “old”
lottery value $B_3(k-1)$.) Since, given $\rho'$, the old value $\beta'_3$ has the law of $X$,
we deduce that:

\[
P_A[\, B_3(k) < \beta_5 \mid \rho',\ R = R' \,]
= P[\, X < \beta_5 \,] + P[\, X > \beta_5,\ \beta_3 < \beta_5 \,] . \tag{3.13}
\]

Using Lemma 8.1.5 we derive that:

\[
P[\, X < \beta_5 \,] \;\geq\; P[\, \beta'_3 < \beta_5 \,] .
\]

On the other hand $P[\, \beta'_3 < \beta_5 \,] = P[\, \beta_3 < \beta_5 \,]$ because all the
random variables $\beta_i(j)$, $i = 1, \ldots, n$, $j \geq 1$ are iid. Taking into account the
fact that the last term of equation 3.13 is non zero, we have then established that:

\[
P_A[\, B_3(k) < \beta_5 \mid \rho',\ R = R' \,] > P[\, \beta_3 < \beta_5 \,] . \tag{3.14}
\]
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Combining Equations 3.11, 3.12 and 3.14 yields: 
Palp|p', R= R'|> Palp|p', RAR’). 


Equation 3.10 then shows that $P_A[\rho \mid \rho'] > P_A[\rho \mid \rho', R \neq R']$. Plugging this result
into Equation 3.9 finishes the proof.


We finish with a result showing that all the problems that we encountered in Rabin’s
algorithm carry over to Ben-Or’s algorithm. Ben-Or’s algorithm is cited at the end
of [49]. Its code is the same as that of Rabin’s algorithm, with the following
modifications. All variables $B$, $R$, $B_i$, $R_i$, $1 \leq i \leq n$, are boolean variables, initially
0. The distribution of the lottery numbers is also different, but this is irrelevant for
our discussion.


We show that Ben-Or’s algorithm does not satisfy the weak no-lockout property of
Definition 3.5.2. The situation is much simpler than in the case of Rabin’s algorithm:
here all the variables are boolean, so a simple argument can be worked out.


Theorem 3.6.9 (Ben-Or’s Alg.) There is an adversary $A$, a step number $t$ and
a run $\rho_t$ compatible with $A$ such that


$P[W_2(k) \mid \pi_t = \rho_t,\ 2 \in P(k)] = 0$.


Proof. Assume that we are in the middle of round 3, and that the run $\rho_t$ indicates
that (at time 0 the critical section was free and then that) the schedule 1 2 2 3
3 was followed, that at this point 3 entered Crit, that it left Crit, that at this
point the schedule 4 1 1 5 5 was followed, that 5 entered and then left Crit, that
6 4 4 then took steps and that at this point Crit is still free.


Without loss of generality assume that the round number $R(1)$ is 0. Then $R_2(1) = 0$,
$B_1(1) = 1$ and $B_2(1) = 0$: if not 2 would have entered Crit. In round 2 it then
must be the case that R(2) = 1. Indeed if this was not the case then 1 would have 
entered the critical section. It must then be the case that B,(2) = 0 and B,(2) = 1. 
And then that $B_6(3) = 1$ and $R(3) = 0$: if this were not the case then 4 would have
entered Crit in the 3rd round.


But at this point, 2 has no chance to win if scheduled to take a step! 


3.7 Appendix 


This section presents some useful mathematical properties of the truncated exponential
distribution used in [49]. Theorem 3.7.1 and its corollaries are used in the
construction of the adversary in Theorem 3.6.2 and Theorem 3.6.4.


Definition 3.7.1 For any sequence $(a_i)_{i \in \mathbb{N}}$ we denote $\mathrm{Max}_s\, a_i \overset{\mathrm{def}}{=} \mathrm{Max}\{a_1, a_2, \ldots, a_s\}$.


In this section the sequence $(\beta_i)$ is a sequence of iid geometric random variables:


$P[\beta_i = l] = \dfrac{1}{2^l}, \quad l = 1, 2, \ldots$


The following results are about the distribution of the extremal function $\mathrm{Max}_s\, \beta_i$.
The same probabilistic results hold for iid random variables $(\beta'_i)_i$ having the truncated
distribution used in [49]: we just need to truncate at $\log_2 n + 4$ the random
variables $\beta_i$ and the values that they take. This does not affect the probabilities
because, by definition, $P[\beta_i(k) = \log_2 n + 4] = \sum_{l \geq \log_2 n + 4} P[\beta_i = l]$. We will need
the following function:


$\phi(x) \overset{\mathrm{def}}{=} 1 - e^{-2^{1-x}}$.  (3.15)


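Theorem 3.7.1 For all $s \in \mathbb{N}$ and $x \in \mathbb{R}$ such that $\log_2 s + x \in \mathbb{N}$ and such that
$2^{1-x}/s \leq 1/2$, we have the following approximation:


$A \overset{\mathrm{def}}{=} P[\mathrm{Max}_s\, \beta_i \geq \log_2 s + x] \approx 1 - e^{-2^{1-x}}$.


A bound on the error is given by $|A - (1 - e^{-2^{1-x}})| \leq e^{-2^{1-x}}\, 4^{1-x} / s$.


Proof. We easily see that, $\forall j \in \mathbb{N}$, $P[\mathrm{Max}_s\, \beta_i < j] = (1 - 2^{1-j})^s$. Setting
$j = \log_2 s + x$ gives:


$P[\mathrm{Max}_s\, \beta_i < \log_2 s + x] = \left(1 - \dfrac{2^{1-x}}{s}\right)^s \approx e^{-2^{1-x}}$.


The upper bound on the error term is obtained by writing a precise asymptotic
expansion of $\left(1 - \frac{2^{1-x}}{s}\right)^s$.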


The upper bound on the error shows that this approximation is very tight when $s$ is
big. In the construction of Theorem 3.6.2 we consider the case where $s \approx n/2$
and « = —1/2logn. The error term is then less then e~””. As an illustration of 
the theorem we deduce the two following results. 


Corollary 3.7.2 Consider $s > 1$. Then $P[\mathrm{Max}_s\, \beta_i \geq \lfloor \log_2 s \rfloor - 4] \geq 1 - e^{-32}$.


Proof. Write $\lfloor \log_2 s \rfloor = \log_2 s - t$ with $0 \leq t < 1$. Then


$P[\mathrm{Max}_s\, \beta_i \geq \lfloor \log_2 s \rfloor - 4] = P[\mathrm{Max}_s\, \beta_i \geq \log_2 s - (4 + t)]$
$\quad \approx 1 - e^{-2^{1+(4+t)}} \geq 1 - e^{-32}$.


Corollary 3.7.3 Consider $s > 1$. Then $P[\mathrm{Max}_s\, \beta_i \geq \lceil \log_2 s \rceil + 8] \leq 0.01$.


Proof. $1 - e^{-2^{-7}} \approx 2^{-7} \leq 0.01$.


These results express that the maximum of $s$ random variables $\beta_i$ is concentrated
tightly around $\log_2 s$: Corollary 3.7.2 shows that the maximum is with overwhelming
probability at least as big as $\lfloor \log_2 s \rfloor - 4$, whereas Corollary 3.7.3 shows that with
probability 99% this maximum is at most $\lceil \log_2 s \rceil + 7$.
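

The concentration statement can be checked numerically. The sketch below is only an illustration, not part of the original argument: it assumes the untruncated geometric law $P[\beta_i = l] = 2^{-l}$ used in this section, and the helper names (`prob_max_below`, `approx_below`, `floor_log2`, `ceil_log2`) are ours. It compares the exact law of the maximum with the approximation of Theorem 3.7.1.

```python
import math


def prob_max_below(s: int, j: int) -> float:
    """Exact P[Max_s beta_i < j] for iid beta_i with P[beta_i = l] = 2**-l, l >= 1."""
    if j <= 1:
        return 0.0  # every beta_i is at least 1, so the maximum is never below 1
    return (1.0 - 2.0 ** (1 - j)) ** s


def approx_below(x: float) -> float:
    """Approximation exp(-2**(1 - x)) of P[Max_s beta_i < log2(s) + x]."""
    return math.exp(-(2.0 ** (1 - x)))


def floor_log2(s: int) -> int:
    """Integer floor of log2(s) for s >= 1, computed without floating point."""
    return s.bit_length() - 1


def ceil_log2(s: int) -> int:
    """Integer ceiling of log2(s) for s >= 1, computed without floating point."""
    return (s - 1).bit_length()


if __name__ == "__main__":
    s = 1024  # a power of two, so log2(s) + x is an integer for every integer x
    for x in range(-3, 9):
        exact = 1.0 - prob_max_below(s, floor_log2(s) + x)
        approx = 1.0 - approx_below(x)
        print(f"x={x:2d}  exact={exact:.6f}  approx={approx:.6f}")
```

For $s = 1024$ the two columns agree to roughly three decimal places, which is what an error term of order $1/s$ would suggest.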


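Corollary 3.7.4 Let $s > 1$. Then $P[\mathrm{Max}_s\, \beta_i = \lceil \log_2 s \rceil] \geq 0.17$. For $a \geq 1$,
$P[\mathrm{Max}_s\, \beta_i = \lceil \log_2 s \rceil + a] \geq -\phi'(a + 2)$. Hence $P[\mathrm{Max}_s\, \beta_i = \lceil \log_2 s \rceil + a] \geq 0.005$
for $s > 1$ and $a = 1, \ldots, 5$.


Proof. Let $x \in (0,1)$ such that $\log_2 s + x = \lceil \log_2 s \rceil$. (Recall that the random
variables $\beta_i$ are integer valued.) Then $P[\mathrm{Max}_s\, \beta_i = \log_2 s + x] \approx \phi(x) - \phi(x + 1)$.
This is equal to $-\phi'(\zeta) = \log_e 2 \cdot e^{-2^{1-\zeta}}\, 2^{1-\zeta}$ for some $\zeta \in (x, x+1) \subset (0,2)$. We check
immediately that $\phi''$ is negative on $(-\infty, 1)$ and positive on $(1, \infty)$. This allows us
to write that


$P[\mathrm{Max}_s\, \beta_i = \lceil \log_2 s \rceil] \geq \mathrm{Min}(-\phi'(0), -\phi'(2)) \geq 0.17$.


The same argument gives also that $\forall a \geq 1$, $P[\mathrm{Max}_s\, \beta_i = \lceil \log_2 s \rceil + a] = -\phi'(\zeta)$ for
some $\zeta \in (a, a+2)$. In this interval $-\phi'(\zeta) \geq -\phi'(a + 2)$. Hence $\forall a = 1, \ldots, 5$,


$P[\mathrm{Max}_s\, \beta_i = \lceil \log_2 s \rceil + a] \geq -\phi'(7) \approx \log_e 2 \cdot e^{-2^{-6}}\, 2^{-6} \geq 0.005$.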
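

The two numerical constants of Corollary 3.7.4 can be reproduced with a few lines of code. This is a hedged sketch: it takes $\phi(x) = 1 - e^{-2^{1-x}}$, so that $-\phi'(x) = \ln 2 \cdot 2^{1-x} e^{-2^{1-x}}$, and the function names are illustrative choices of ours. It also evaluates the exact point probabilities of the maximum for a range of $s$.

```python
import math


def neg_phi_prime(x: float) -> float:
    """-phi'(x) for phi(x) = 1 - exp(-2**(1 - x))."""
    y = 2.0 ** (1 - x)
    return math.log(2.0) * y * math.exp(-y)


def prob_max_equals(s: int, j: int) -> float:
    """Exact P[Max_s beta_i = j] for iid beta_i with P[beta_i = l] = 2**-l, l >= 1."""
    if j < 1:
        return 0.0  # the maximum of variables that are all >= 1 is itself >= 1
    return (1.0 - 2.0 ** (-j)) ** s - (1.0 - 2.0 ** (1 - j)) ** s


def ceil_log2(s: int) -> int:
    """Integer ceiling of log2(s) for s >= 1."""
    return (s - 1).bit_length()


if __name__ == "__main__":
    print("min(-phi'(0), -phi'(2)) =", min(neg_phi_prime(0), neg_phi_prime(2)))
    print("-phi'(7) =", neg_phi_prime(7))
    for s in (2, 10, 100, 1000):
        row = [prob_max_equals(s, ceil_log2(s) + a) for a in range(6)]
        print(s, [round(p, 4) for p in row])
```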


Chapter 4 


Proving Time Bounds for
Randomized Distributed 
Algorithms 


4.1 Introduction 


This chapter is devoted to the analysis of the timed version of Lehmann-Rabin’s
Dining Philosophers algorithm [37]. We consider the case where, by assumption,
a participating process cannot wait more than time 1 before taking a step. The
scheduling of the processes, i.e., the order in which the various processes take
steps, is not in the control of the algorithm. According to our general paradigm
(see Chapter 1) we therefore let Player(2) decide the schedules. We will prove that
Lehmann-Rabin’s algorithm satisfies a strong correctness property, i.e., a property
holding against Player(2) knowing the whole past execution.


As discussed on page 26 in the section about randomized adversaries, randomization
does not make the adversary more powerful and is not needed to establish the cor- 
rectness of a given algorithm. (We argued nevertheless that considering randomized 
adversaries was very useful for establishing lower bounds.) We will therefore restrict 
ourselves in this chapter to the case of deterministic adversaries. 


Furthermore, following the discussion on page 31, we consider the model where
Player(2) controls the passage of time. (We showed on page 31 that we could
equivalently allocate the time control to Player(1).)


We can summarize the previous discussion by saying that the admissible adversaries
are deterministic, know the past execution and do not let a participating process wait
more than time 1 for a step. (This does not mean that these properties characterize
completely the set of admissible adversaries. We will for instance also require that
admissible adversaries let processes exit from their critical section.)


We showed on page 33 that the model for randomized computing presented in [54]
is equivalent to the model presented in Definition 2.3.1 in the case where 1)
Player(1) knows the complete state of the system and remembers it, and 2)
deterministic adversaries are considered. We therefore can and will equivalently develop
the analysis of [37] in the model of [54].


The original correctness property claimed by Lehmann and Rabin in [37] was that for
all admissible adversaries, the probability that the execution is deadlocked is equal
to zero. The authors of [37] did not write a formal transcription of this property.
In particular they never made explicit what was the event-schema¹ associated to
the informal description “the execution is deadlocked”. (Note that such a property
involves infinitely many random tosses.) They similarly did not provide a formal
proof of correctness making explicit the probability spaces $(\Omega_A, \mathcal{G}_A, P_A)$ described
in our Section 2.4. A more formal proof is therefore needed.


The introduction of time in the proof of correctness presents three advantages. The 
first one is that it will allow us to work “over a finite horizon of time” instead of the 
whole infinite execution. This represents a major simplification of the setting within 
which the proof is conducted. The second advantage is that the timed results are 
interesting in their own right and provide more insight on the rate at which progress 
occurs during an execution. (The correctness statement presented in [37] states in 
essence that progress eventually occurs with probability one.) Last but not least, to 
establish our result we develop a new general method based on progress functions 
defined on states, for proving upper bounds on time for randomized algorithms. 
Our method consists of proving auxiliary statements of the form $U \xrightarrow[p]{t} U'$, which
means that whenever the algorithm begins in a state in set $U$, with probability at
least $p$, it will reach a state in set $U'$ within time $t$. Of course, this method can only be
used for randomized algorithms that include timing assumptions. A key theorem
about our method is the composability of these $U \xrightarrow[p]{t} U'$ arrows, as expressed by
Theorem 4.3.2. This composability result holds even in the case of (many classes
of) non-oblivious adversaries.


We also present two complementary proof rules that help in reasoning about sets
of distinct random choices. Independence arguments about such choices are often
crucial to correctness proofs, yet there are subtle ways in which a non-oblivious
adversary can introduce dependencies. For example, a non-oblivious adversary has the
power to use the outcome of one random choice to decide whether to schedule
another random choice. Our proof rules help to systematize certain kinds of reasoning
about independence.


¹See page 43 for a definition of an event-schema.


As mentioned above, we present our proof in the context of the general framework 
[54] for describing and reasoning about randomized algorithms. This framework in- 
tegrates randomness and nondeterminism into one model, and permits the modeling 
of timed as well as untimed systems. The model of [54] is, in turn, based on existing 
models for untimed and timed distributed systems [30, 41], and adopts many ideas 
from the probabilistic models of [58, 27]. 


Using this general method we are able to prove that $\mathcal{T} \xrightarrow[1/8]{13} \mathcal{C}$, where $\mathcal{T}$ is the set
of states in which some process is in its trying region, while $\mathcal{C}$ is the set of states in
which some process is in its critical region. That is, whenever the algorithm is in
a state in which some process is in the trying region, with probability at least 1/8, within
time 13, it will reach a state in which some process is in its critical region. This
bound depends on the timing assumption that processes never wait more than time
1 between steps. A consequence of this claim is an upper bound (of 63) on the
expected time for some process to reach its critical region.
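

To see why a statement of this kind yields a finite expected time at all, one can use a crude repeated-trials argument. The sketch below is our own illustration and is not the derivation used in this chapter: it assumes that after every unsuccessful window of length $t$ the system is again in a state of the source set (or has already reached the target set), so that the number of windows needed is dominated by a geometric random variable with parameter $p$. The function names are hypothetical.

```python
from fractions import Fraction


def crude_expected_time(t: Fraction, p: Fraction) -> Fraction:
    """Crude bound t / p: at most 1/p windows of length t are needed on average."""
    return t / p


def tail_bound(p: Fraction, k: int) -> Fraction:
    """Bound on the probability that k consecutive windows all fail."""
    return (1 - p) ** k


if __name__ == "__main__":
    t, p = Fraction(13), Fraction(1, 8)
    print(crude_expected_time(t, p))  # 104
    print(tail_bound(p, 2))           # 49/64
```

With $t = 13$ and $p = 1/8$ this crude argument gives 104, which is weaker than the bound of 63 stated above; the sharper constant therefore has to come from a finer analysis than this sketch.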


For comparison, we already mentioned that [37] contains only proof sketches of 
the results claimed. The paper [62] contains a proof that Lehmann and Rabin’s 
algorithm satisfies an eventual progress condition, in the presence of an adversary 
with complete knowledge of the past; this proof is carried out as an instance of Zuck 
and Pnueli’s general method for proving liveness properties. Our results about this 
protocol can be regarded as a refinement of the results of Zuck and Pnueli, in that 
we obtain explicit constant time bounds rather than liveness properties. 


The rest of this chapter is organized as follows. Section 4.2 presents a simplified version
of the model of [54]. Section 4.3 presents our main proof technique based on time- 
bound statements. Section 4.4 presents the additional proof rules for independence 
of distinct probabilistic choices. Section 4.5 presents the Lehmann-Rabin algorithm. 
Section 4.6.2 formalizes the algorithm in terms of the model of Section 4.2, and gives 
an overview of our time bound proof. Section 4.7 contains the details of the time 
bound proof. 


Acknowledgments. Sections 4.2, 4.3 and 4.4 were written by Roberto Segala.
The references in the subsequent proofs to execution automata, and to the event
schemas Unit-Time, FIRST(a, U) and NEXT(a, U) are also his contribution. (The
notation $U \xrightarrow[p]{t} U'$ and Theorem 4.3.2 are part of the work of the author.)


4.2 The Model


In this section, we present the model that is used to formulate our proof technique. 
It is a simplified version of the probabilistic automaton model of [54]. As mentioned
on page 33, this model considers the case where 1) Player(1) knows and remembers
the complete state of the system, and 2) deterministic adversaries are considered.
Under these conditions it is equivalent to the model presented in Chapter 2. Here 
we only give the parts of the model that we need to describe our proof method and 
its application to the Lehmann-Rabin algorithm; we refer the reader to [54] for more 
details. 


Definition 4.2.1 A probabilistic automaton² $M$ consists of four components:


• a set $states(M)$ of states

• a nonempty set $start(M) \subseteq states(M)$ of start states

• an action signature $sig(M) = (ext(M), int(M))$ where $ext(M)$ and $int(M)$ are
disjoint sets of external and internal actions, respectively

• a transition relation $steps(M) \subseteq states(M) \times acts(M) \times Probs(states(M))$,
where the set $Probs(states(M))$ is the set of probability spaces $(\Omega, \mathcal{F}, P)$
such that $\Omega \subseteq states(M)$ and $\mathcal{F} = 2^{\Omega}$. The last requirement is needed for
technical convenience.


A probabilistic automaton is fully probabilistic if it has a unique start state and from
each state there is at most one step enabled.


Thus, a probabilistic automaton is a state machine with a labeled transition relation 
such that the state reached during a step is determined by some probability distri- 
bution. For example, the process of flipping a coin is represented by a step labeled 
with an action flip where the next state contains the outcome of the coin flip and 
is determined by a probability distribution over the two possible outcomes. A prob- 
abilistic automaton also allows nondeterministic choices over steps. An example of 
nondeterminism is the choice of which process takes the next step in a multi-process 
system. 


An execution fragment $\alpha$ of a probabilistic automaton $M$ is a (finite or infinite)
sequence of alternating states and actions starting with a state and, if the execution
fragment is finite, ending in a state, $\alpha = s_0 a_1 s_1 a_2 s_2 \cdots$, where for each $i$ there exists
a probability space $(\Omega, \mathcal{F}, P)$ such that $(s_i, a_{i+1}, (\Omega, \mathcal{F}, P)) \in steps(M)$ and $s_{i+1} \in \Omega$.
Denote by $fstate(\alpha)$ the first state of $\alpha$ and, if $\alpha$ is finite, denote by $lstate(\alpha)$ the
last state of $\alpha$. Furthermore, denote by $frag^*(M)$ and $frag(M)$ the sets of finite and
all execution fragments of $M$, respectively. An execution is an execution fragment
whose first state is a start state. Denote by $exec^*(M)$ and $exec(M)$ the sets of finite
and all executions of $M$, respectively. A state $s$ of $M$ is reachable if there exists
a finite execution of $M$ that ends in $s$. Denote by $rstates(M)$ the set of reachable
states of $M$.


²In [54] the probabilistic automata of this definition are called simple probabilistic automata.
This is because that paper also includes the case of randomized adversaries.


A finite execution fragment $\alpha_1 = s_0 a_1 s_1 \cdots a_n s_n$ of $M$ and an execution fragment
$\alpha_2 = s_n a_{n+1} s_{n+1} \cdots$ of $M$ can be concatenated. In this case the concatenation,
written $\alpha_1 \frown \alpha_2$, is the execution fragment $s_0 a_1 s_1 \cdots a_n s_n a_{n+1} s_{n+1} \cdots$. An execution
fragment $\alpha_1$ of $M$ is a prefix of an execution fragment $\alpha_2$ of $M$, written $\alpha_1 \leq \alpha_2$, if
either $\alpha_1 = \alpha_2$ or $\alpha_1$ is finite and there exists an execution fragment $\alpha'_1$ of $M$ such
that $\alpha_2 = \alpha_1 \frown \alpha'_1$.


In order to study the probabilistic behavior of a probabilistic automaton, some
mechanism to remove nondeterminism is necessary. To give an idea of why the
nondeterministic behavior should be removed, consider a probabilistic automaton
with three states $s_0, s_1, s_2$ and with two steps enabled from its start state $s_0$; the
first step moves to state $s_1$ with probability 1/2 and to $s_2$ with probability 1/2;
the second step moves to state $s_1$ with probability 1/3 and to $s_2$ with probability
2/3. What is the probability of reaching state $s_1$? The answer depends on how
the nondeterminism between the two steps is resolved. If the first step is chosen,
then the probability of reaching state $s_1$ is 1/2; if the second step is chosen, then
the probability of reaching state $s_1$ is 1/3. We call the mechanism that removes
the nondeterminism an adversary, because it is often viewed as trying to thwart the
efforts of a system to reach its goals. In distributed systems the adversary is often
called the scheduler, because its main job may be to decide which process should
take the next step.
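

The three-state example can be written down concretely. The following is a minimal sketch under simplifying assumptions of ours: action names are omitted, a history is just the tuple of visited states, and an adversary is a Python function returning the index of an enabled step or `None`. The names `STEPS`, `reach_probability` and `always_pick` are illustrative only.

```python
from fractions import Fraction

# Steps enabled in each state; each step is a probability distribution over next states.
STEPS = {
    "s0": [
        {"s1": Fraction(1, 2), "s2": Fraction(1, 2)},  # first step
        {"s1": Fraction(1, 3), "s2": Fraction(2, 3)},  # second step
    ],
    "s1": [],  # no step enabled
    "s2": [],  # no step enabled
}


def reach_probability(adversary, target, history=("s0",)):
    """Probability that an execution extending `history` visits `target`."""
    state = history[-1]
    if state == target:
        return Fraction(1)
    choice = adversary(history)
    if choice is None:
        return Fraction(0)  # the adversary stops here: the target is never reached
    total = Fraction(0)
    for nxt, prob in STEPS[state][choice].items():
        total += prob * reach_probability(adversary, target, history + (nxt,))
    return total


def always_pick(index):
    """Adversary choosing the step with the given index whenever it is enabled."""
    def adversary(history):
        return index if len(STEPS[history[-1]]) > index else None
    return adversary


print(reach_probability(always_pick(0), "s1"))  # 1/2
print(reach_probability(always_pick(1), "s1"))  # 1/3
```

The probability of reaching $s_1$ is only defined once the adversary is fixed, which is the point of the example.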


Definition 4.2.2 An adversary for a probabilistic automaton $M$ is a function $A$
taking a finite execution fragment of $M$ and giving back either nothing (represented
as $\delta$) or one of the enabled steps of $M$ if there are any. Denote the set of adversaries
for $M$ by $Advs_M$³.


Once an adversary is chosen, a probabilistic automaton can run under the control
of the chosen adversary. The result of the interaction is called an execution automaton.
The definition of an execution automaton, given below, is rather complicated
because an execution automaton must contain all the information about the different
choices of the adversary, and thus the states of an execution automaton must
contain the complete history of a probabilistic automaton. Note that there are no
nondeterministic choices left in an execution automaton.


³In [54] the adversaries of this definition are denoted by $DAdvs_M$, where D stands for
Deterministic. The adversaries of [54] are allowed to use randomness.


Definition 4.2.3 An execution automaton $H$ of a probabilistic automaton $M$ is a
fully probabilistic automaton such that


1. $states(H) \subseteq frag^*(M)$.


2. for each step $(\alpha, a, (\Omega, \mathcal{F}, P))$ of $H$ there is a step $(lstate(\alpha), a, (\Omega', \mathcal{F}', P'))$ of
$M$, called the corresponding step, such that $\Omega = \{\alpha a s \mid s \in \Omega'\}$ and $P[\alpha a s] =
P'[s]$ for each $s \in \Omega'$.


3. each state of $H$ is reachable, i.e., for each $\alpha \in states(H)$ there exists an
execution of $H$ leading to state $\alpha$.


Definition 4.2.4 Given a probabilistic automaton $M$, an adversary $A \in Advs_M$,
and an execution fragment $\alpha \in frag^*(M)$, the execution $H(M, A, \alpha)$ of $M$ under
adversary $A$ with starting fragment $\alpha$ is the execution automaton of $M$ whose start
state is $\alpha$ and such that for each step $(\alpha', a, (\Omega, \mathcal{F}, P)) \in steps(H(M, A, \alpha))$, its
corresponding step is the step $A(\alpha')$.


Given an execution automaton $H$, an event is expressed by means of a set of maximal
executions of $H$, where a maximal execution of $H$ is either infinite, or it is finite
and its last state does not enable any step in $H$. For example, the event “eventually
action $a$ occurs” is the set of maximal executions of $H$ where action $a$ does occur.
A more formal definition follows. The sample space $\Omega_H$ is the set of maximal
executions of $H$. The $\sigma$-algebra $\mathcal{F}_H$ is the smallest $\sigma$-algebra that contains the set
of rectangles $R_\alpha$, consisting of the executions of $\Omega_H$ having $\alpha$ as a prefix⁴. The
probability measure $P_H$ is the unique extension of the probability measure defined
on rectangles as follows: $P_H[R_\alpha]$ is the product of the probabilities of each step of $H$
generating $\alpha$. In [54] it is shown that there is a unique probability measure having
the property above, and thus $(\Omega_H, \mathcal{F}_H, P_H)$ is a well defined probability space. For
the rest of this chapter we do not need to refer to this formal definition any more.


Events of $\mathcal{F}_H$ are not sufficient for the analysis of a probabilistic automaton. Events
are defined over execution automata, but a probabilistic automaton may generate
several execution automata depending on the adversary it interacts with. Thus a
more general notion of event is needed that can deal with all execution automata.
Specific examples are given in Section 4.3.


⁴Note that a rectangle $R_\alpha$ can be used to express the fact that the finite execution $\alpha$ occurs.


Definition 4.2.5 An event schema $e$ for a probabilistic automaton $M$ is a function
associating an event of $\mathcal{F}_H$ with each execution automaton $H$ of $M$.


We now discuss briefly a simple way to handle time within probabilistic automata.
The idea is to add a time component to the states of a probabilistic automaton,
to assume that the time at a start state is 0, to add a special non-visible action
$\nu$ modeling the passage of time, and to add arbitrary time passage steps to each
state. A time passage step should be non-probabilistic and should change only the
time component of a state. This construction is called the patient construction in
[44, 57, 22]. The reader interested in a more general extension to timed models is
referred to [54].


We close this section with one final definition. Our time bound property for the 
Lehmann-Rabin algorithm states that if some process is in its trying region, then 
no matter how the steps of the system are scheduled, some process enters its critical 
region within time $t$ with probability at least $p$. However, this claim can only be
valid if each process has sufficiently frequent chances to perform a step of its local 
program. Thus, we need a way to restrict the set of adversaries for a probabilistic 
automaton. The following definition provides a general way of doing this. 


Notation. We let $Advs$ denote a subset of $Advs_M$.


4.3 The Proof Method


In this section, we introduce our key statement $U \xrightarrow[p]{t}_{Advs} U'$ and the composability
theorem, which is our main theorem about the proof method.


The meaning of the statement $U \xrightarrow[p]{t}_{Advs} U'$ is that, starting from any state of $U$
and under any adversary $A$ of $Advs$, the probability of reaching a state of $U'$ within
time $t$ is at least $p$. The suffix $Advs$ is omitted whenever we think it is clear from
the context.


Definition 4.3.1 Let $e_{U',t}$ be the event schema that, applied to an execution
automaton $H$, returns the set of maximal executions $\alpha$ of $H$ where a state from $U'$ is
reached in some state of $\alpha$ within time $t$. Then $U \xrightarrow[p]{t}_{Advs} U'$ iff for each $s \in U$ and
each $A \in Advs$, $P_{H(M,A,s)}[e_{U',t}(H(M, A, s))] \geq p$.


Proposition 4.3.1 Let $U, U', U''$ be sets of states of a probabilistic automaton $M$.
If $U \xrightarrow[p]{t} U'$, then $U \cup U'' \xrightarrow[p]{t} U' \cup U''$.


In order to compose time bound statements, we need a restriction for adversary 
schemas stating that the power of the adversary schema is not reduced if a prefix of 
the past history of the execution is not known. Most adversary schemas that appear 
in the literature satisfy this restriction. 


Definition 4.3.2 An adversary schema $Advs$ for a probabilistic automaton $M$ is
execution closed if, for each $A \in Advs$ and each finite execution fragment $\alpha \in
frag^*(M)$, there exists an adversary $A' \in Advs$ such that for each execution fragment
$\alpha' \in frag^*(M)$ with $lstate(\alpha) = fstate(\alpha')$, $A'(\alpha') = A(\alpha \frown \alpha')$.


Theorem 4.3.2 Let $Advs$ be an execution closed adversary schema for a probabilistic
timed automaton $M$, and let $U, U', U''$ be sets of states of $M$.
If $U \xrightarrow[p_1]{t_1}_{Advs} U'$ and $U' \xrightarrow[p_2]{t_2}_{Advs} U''$, then $U \xrightarrow[p_1 p_2]{t_1 + t_2}_{Advs} U''$.


Sketch of proof: Consider an adversary $A \in Advs$ that acts on $M$ starting from
a state $s$ of $U$. The execution automaton $H(M, A, s)$ contains executions where a
state from $U'$ is reached within time $t_1$. Consider one of those executions $\alpha$ and
consider the part $H$ of $H(M, A, s)$ after the first occurrence of a state from $U'$ in
$\alpha$. The key idea of the proof is to use execution closure of $Advs$ to show that there
is an adversary that generates $H$, to use $U' \xrightarrow[p_2]{t_2}_{Advs} U''$ to show that in $H$ a state
from $U''$ is reached within time $t_2$ with probability at least $p_2$, and to integrate
this last result in the computation of the probability of reaching a state from $U''$ in
$H(M, A, s)$ within time $t_1 + t_2$. □
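

The bookkeeping performed by the composability theorem is simple: times add and probabilities multiply. The sketch below only illustrates that bookkeeping. The `Arrow` class, the set names and the numbers are hypothetical and do not correspond to arrows proved in this chapter, and the code does not check the execution-closure hypothesis on which the theorem relies.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Arrow:
    """A time-bound statement: from `source`, reach `target` within `time` w.p. >= `prob`."""
    source: str
    target: str
    time: Fraction
    prob: Fraction


def compose(first: Arrow, second: Arrow) -> Arrow:
    """Chain two arrows as in the composability theorem."""
    if first.target != second.source:
        raise ValueError("arrows do not chain: target and source sets differ")
    return Arrow(first.source, second.target,
                 first.time + second.time, first.prob * second.prob)


# Hypothetical arrows, for illustration only.
a = Arrow("U", "U'", Fraction(2), Fraction(1, 2))
b = Arrow("U'", "U''", Fraction(3), Fraction(1, 4))
print(compose(a, b))
```

Chaining several such arrows is how a long progress argument can be split into short, separately proved stages.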


4.4 Independence 


Example 4.4.1 Consider any distributed algorithm where each process is allowed
to flip fair coins. It is common to say “If the next coin flip of process P yields head
and the next coin flip of process Q yields tail, then some good property $\phi$ holds.”
Can we conclude that the probability for $\phi$ to hold is 1/4? That is, can we assume
that the coin flips of processes P and Q are independent? The two coin flips are
indeed independent of each other, but the presence of non-oblivious adversaries may
introduce some dependence. An adversary can schedule process P to flip its coin
and then schedule process Q only if the coin flip of process P yielded head. As a
result, if both P and Q flip a coin, the probability that P yields head and Q yields
tail is 1/2.
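

The effect of this particular adversary can be verified by listing its maximal executions. The following sketch is an illustration with names of our own choosing (`executions`, `probability`); it hard-codes the adversary of Example 4.4.1, which flips P's coin first and lets Q flip only after a head.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def executions():
    """Maximal executions under the adversary of Example 4.4.1.

    Each entry is (outcome of P, outcome of Q or None if Q never flips, probability).
    """
    result = []
    for q in ("head", "tail"):
        result.append(("head", q, HALF * HALF))  # P gave head, so Q is scheduled
    result.append(("tail", None, HALF))          # P gave tail, so Q never flips
    return result


def probability(predicate):
    """Total probability of the executions satisfying `predicate`."""
    return sum((pr for p, q, pr in executions() if predicate(p, q)), Fraction(0))


both_flip = probability(lambda p, q: q is not None)
good = probability(lambda p, q: p == "head" and q == "tail")
print(good / both_flip)  # 1/2: conditional probability given that both coins are flipped
print(good)              # 1/4: unconditional probability of "P head and Q tail"
```

The conditional probability is 1/2 rather than 1/4 because the adversary makes the event "both coins are flipped" depend on the outcome of P's coin.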


Thus, it is necessary to be extremely careful about independence assumptions. It 
is also important to pay attention to potential ambiguities of informal arguments. 
For example, does $\phi$ hold if process P flips a coin yielding head and process Q does
not flip any coin? Certainly such an ambiguity can be avoided by expressing each 
event in a formal model. 


In this section we present two event schemas that play a key role in the detailed 
time bound proof for the Lehmann-Rabin algorithm, and we show some partial 
independence properties for them. The first event schema is a generalization of 
the informal statement of Example 4.4.1, where a coin flip is replaced by a generic 
action a, and where it is assumed that an event contains all the executions where 
$a$ is not scheduled; the second event schema is used to analyze the outcome of the
first random draw that occurs among a fixed set of random draws. A consequence
of the partial independence results that we show below is that under any adversary
the property $\phi$ of Example 4.4.1 holds with probability at least 1/4.


Let (a,U) be a pair consisting of an action of M and a set of states of M. The 
event schema FIRST(a,U) is the function that, given an execution automaton H, 
returns the set of maximal executions of H where either action a does not occur, or 
action a occurs and the state reached after the first occurrence of a is a state of U. 
This event schema is used to express properties like “the $i$-th coin yields left”. For
example a can be flip and U can be the set of states of M where the result of the 
coin flip is left. 


Let $(a_1, U_1), \ldots, (a_n, U_n)$ be a sequence of pairs consisting of an action of $M$ and a
set of states of $M$ such that for each $i, j$, $1 \leq i < j \leq n$, $a_i \neq a_j$. Define the event
schema NEXT$((a_1, U_1), \ldots, (a_n, U_n))$ to be the function that applied to an execution
automaton $H$ gives the set of maximal executions of $H$ where either no action from
$\{a_1, \ldots, a_n\}$ occurs, or at least one action from $\{a_1, \ldots, a_n\}$ occurs and, if $a_i$ is the
first action that occurs, the state reached after the first occurrence of $a_i$ is in $U_i$.
This kind of event schema is used to express properties like “the first coin that is
flipped yields left.”


Proposition 4.4.2 Let $H$ be an execution automaton of a probabilistic automaton
$M$. Furthermore, let $(a_1, U_1), \ldots, (a_n, U_n)$ be pairs consisting of an action of $M$ and
a set of states of $M$ such that for each $i, j$, $1 \leq i < j \leq n$, $a_i \neq a_j$. Finally, let
$p_1, \ldots, p_n$ be real numbers between 0 and 1 such that for each $i$, $1 \leq i \leq n$, and each
step $(s, a, (\Omega, \mathcal{F}, P)) \in steps(M)$ with $a = a_i$, the probability $P[U_i \cap \Omega]$ is greater
than or equal to $p_i$, i.e., $P[U_i \cap \Omega] \geq p_i$. Then


1. $P_H[(\mathrm{FIRST}(a_1, U_1) \cap \cdots \cap \mathrm{FIRST}(a_n, U_n))(H)] \geq p_1 \cdots p_n$,


2. $P_H[\mathrm{NEXT}((a_1, U_1), \ldots, (a_n, U_n))(H)] \geq \min(p_1, \ldots, p_n)$.
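

For two fair coins the two bounds can be checked exhaustively. The sketch below is an illustration under assumptions of ours: each of the two coins (called P and Q, with targets head and tail as in Example 4.4.1) is flipped at most once, and the adversary sees the whole history before deciding whether to stop or which remaining coin to flip. The minimum over all such deterministic adversaries is computed by backward induction; the names `first_event`, `next_event` and `worst_case` are hypothetical.

```python
from fractions import Fraction

COINS = ("P", "Q")
TARGET = {"P": "head", "Q": "tail"}  # the sets U_i: P should yield head, Q should yield tail
HALF = Fraction(1, 2)


def first_event(history):
    """FIRST(flip_P, head) and FIRST(flip_Q, tail): every coin that was flipped hit its target."""
    return all(outcome == TARGET[coin] for coin, outcome in history)


def next_event(history):
    """NEXT: no coin was flipped, or the first coin flipped hit its target."""
    if not history:
        return True
    coin, outcome = history[0]
    return outcome == TARGET[coin]


def worst_case(event, history=()):
    """Minimum over deterministic, history-aware adversaries of P[event]."""
    best = Fraction(int(event(history)))  # option 1: the adversary stops here
    flipped = {coin for coin, _ in history}
    for coin in COINS:                    # option 2: flip one of the remaining coins
        if coin in flipped:
            continue
        value = sum(
            (HALF * worst_case(event, history + ((coin, outcome),))
             for outcome in ("head", "tail")),
            Fraction(0),
        )
        best = min(best, value)
    return best


print(worst_case(first_event))  # 1/4, matching the bound p_1 * p_2
print(worst_case(next_event))   # 1/2, matching the bound min(p_1, p_2)
```

In this small instance both bounds of the proposition are attained, so they cannot be improved in general.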


4.5 The Lehmann-Rabin Algorithm 


The Lehmann-Rabin algorithm is a randomized algorithm for the Dining Philoso- 
phers problem. This problem involves the allocation of n resources among n com- 
peting processes arranged in a ring. The resources are considered to be interspersed 
between the processes, and each process requires both its adjacent resources in or- 
der to reach its critical section. All processes are identical; the algorithm breaks 
symmetry by using randomization. The algorithm ensures the required exclusive 
possession of resources, and also ensures that, with probability 1, some process is 
always permitted to make progress into its critical region. 


Figure 4.1 shows the code for a generic process $i$. The $n$ resources are represented by
$n$ shared variables $Res_1, \ldots, Res_n$, each of which can assume values in {free, taken}.
Each process $i$ ignores its own name, $i$, and the names, $Res_{i-1}$ and $Res_i$, of its
adjacent resources. However, each process $i$ is able to refer to its adjacent resources
by relative names: $Res_{(i,left)}$ is the resource located to the left (clockwise), and
$Res_{(i,right)}$ is the resource to the right (counterclockwise) of $i$. Each process has a
private variable $u_i$, which can assume a value in {left, right}, and is used to keep
track of the first resource to be handled. For notational convenience we define an
operator opp that complements the value of its argument, i.e., opp(right) = left
and opp(left) = right.


The atomic actions of the code are individual resource accesses, and they are rep- 
resented in the form <atomic-action> in Figure 4.1. We assume that at most one 
process has access to the shared resource at each time. 


An informal description of the procedure is “choose a side randomly in each iteration. 
Wait for the resource on the chosen side, and, after getting it, just check once for 
the second resource. If this check succeeds, then proceed to the critical region. 
Otherwise, put down the first resource and try again with a new random choice.” 


Each process exchanges messages with an external user. In its idle state, a process
is in its remainder region R. When triggered by a try message from the user, it
enters the competition to get its resources: we say that it enters its trying region T.
When the resources are obtained, it sends a crit message informing the user of the
possession of these resources: we then say that the process is in its critical region
C. When triggered by an exit message from the user, it begins relinquishing its
resources: we then say that the process is in its exit region E. When the resources
are relinquished it sends a rem message to the user and enters its remainder region.


Shared variables: Res_j ∈ {free, taken}, j = 1,...,n, initially free.
Local variables: u_i ∈ {left, right}, i = 1,...,n

Code for process i:

0. try                                   ** beginning of Trying Section **
1. < u_i ← random >                      ** choose left or right with equal probability **
2. < if Res_(i,u_i) = free then
       Res_(i,u_i) := taken              ** pick up first resource **
     else goto 2. >
3. < if Res_(i,opp(u_i)) = free then
       Res_(i,opp(u_i)) := taken;        ** pick up second resource **
       goto 5. >
4. < Res_(i,u_i) := free; goto 1. >      ** put down first resource **
5. crit                                  ** end of Trying Section **
                                         ** Critical Section **
6. exit                                  ** beginning of Exit Section **
7. < u_i ← left or right                 ** nondeterministic choice **
     Res_(i,opp(u_i)) := free >          ** put down first resource **
8. < Res_(i,u_i) := free >               ** put down second resource **
9. rem                                   ** end of Exit Section **
                                         ** Remainder Section **


Figure 4.1: The Lehmann-Rabin algorithm
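

To make the control flow of Figure 4.1 concrete, here is a small simulation sketch. It is not the timed model analyzed in this chapter. It assumes a uniformly random scheduler instead of an adversary, it ignores time, it lets every process re-enter the trying region as soon as it is scheduled in its remainder region, and it resolves the nondeterministic choice of line 7 by a coin flip. Processes are indexed from 0, with `res[i]` located between process `i` and process `(i + 1) % n`. All names are illustrative.

```python
import random

HOLDS_BOTH = {"P", "C", "EF"}  # program counters at which a process holds both resources


def opp(side):
    """Complement of a side."""
    return "right" if side == "left" else "left"


def simulate(n, steps, seed=0):
    """Run `steps` atomic actions for n >= 2 processes; return the number of crit actions."""
    rng = random.Random(seed)
    res = ["free"] * n   # res[i] sits between process i and process (i + 1) % n
    pc = ["R"] * n       # program counters, named as in the notation table of Section 4.6.1
    u = ["left"] * n     # first resource to be handled by each process
    crit_entries = 0

    def idx(i, side):
        # Right resource of process i is res[i]; left resource is res[i - 1].
        return i if side == "right" else (i - 1) % n

    for _ in range(steps):
        i = rng.randrange(n)  # assumption: uniformly random scheduler
        state = pc[i]
        if state == "R":      # 0. try
            pc[i] = "F"
        elif state == "F":    # 1. flip
            u[i] = rng.choice(["left", "right"])
            pc[i] = "W"
        elif state == "W":    # 2. wait for the first resource
            r = idx(i, u[i])
            if res[r] == "free":
                res[r] = "taken"
                pc[i] = "S"
        elif state == "S":    # 3. check once for the second resource
            r = idx(i, opp(u[i]))
            if res[r] == "free":
                res[r] = "taken"
                pc[i] = "P"
            else:
                pc[i] = "D"
        elif state == "D":    # 4. drop the first resource and retry
            res[idx(i, u[i])] = "free"
            pc[i] = "F"
        elif state == "P":    # 5. crit
            pc[i] = "C"
            crit_entries += 1
        elif state == "C":    # 6. exit
            pc[i] = "EF"
        elif state == "EF":   # 7. choose a side, drop the opposite resource
            u[i] = rng.choice(["left", "right"])
            res[idx(i, opp(u[i]))] = "free"
            pc[i] = "ES"
        elif state == "ES":   # 8. drop the remaining resource
            res[idx(i, u[i])] = "free"
            pc[i] = "ER"
        else:                 # 9. rem
            pc[i] = "R"
        # Safety check: two neighbours never hold their shared resource simultaneously.
        for j in range(n):
            assert not (pc[j] in HOLDS_BOTH and pc[(j + 1) % n] in HOLDS_BOTH)
    return crit_entries


if __name__ == "__main__":
    print(simulate(n=5, steps=20000, seed=1))
```

A run of this kind can only illustrate mutual exclusion and progress under one benign scheduler; the results of this chapter concern all admissible adversaries.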


4.6 Overview of the Proof 


In this section, we give our high-level overview of the proof. We first introduce
some notation, then sketch the proof strategy at a high level. The detailed proof is
presented in Section 4.7.


4.6.1 Notation


In this section we define a probabilistic automaton $M$ which describes the system
of Section 4.5. We assume that process $i+1$ is on the right of process $i$ and that
resource $\mathit{Res}_i$ is between processes $i$ and $i+1$. We also identify labels modulo $n$ so
that, for instance, process $n+1$ coincides with process 1.


A state $s$ of $M$ is a tuple $(X_1, \ldots, X_n, \mathit{Res}_1, \ldots, \mathit{Res}_n, t)$ containing the local
state $X_i$ of each process $i$, the value of each resource $\mathit{Res}_i$, and the current time $t$.
Each local state $X_i$ is a pair $(pc_i, u_i)$ consisting of a program counter $pc_i$ and the
local variable $u_i$. The program counter of each process keeps track of the current
instruction in the code of Figure 4.1. Rather than representing the value of the
program counter with a number, we use a more suggestive notation which is explained
in the table below. Also, the execution of each instruction is represented by an action.
Only the actions $\mathtt{try}_i$, $\mathtt{crit}_i$, $\mathtt{rem}_i$, $\mathtt{exit}_i$ below are external actions.


Number   $pc_i$   Action name           Informal meaning

0        $R$      $\mathtt{try}_i$      Remainder region
1        $F$      $\mathtt{flip}_i$     Ready to Flip
2        $W$      $\mathtt{wait}_i$     Waiting for first resource
3        $S$      $\mathtt{second}_i$   Checking for Second resource
4        $D$      $\mathtt{drop}_i$     Dropping first resource
5        $P$      $\mathtt{crit}_i$     Pre-critical region
6        $C$      $\mathtt{exit}_i$     Critical region
7        $E_F$    $\mathtt{dropf}_i$    Exit: drop First resource
8        $E_S$    $\mathtt{drops}_i$    Exit: drop Second resource
9        $E_R$    $\mathtt{rem}_i$      Exit: move to Remainder region


The start state of $M$ assigns the value free to all the shared variables $\mathit{Res}_i$, the
value $R$ to each program counter $pc_i$, and an arbitrary value to each variable $u_i$.
The transition relation of $M$ is derived directly from Figure 4.1. For example, for
each state where $pc_i = F$ there is an internal step $\mathtt{flip}_i$ that changes $pc_i$ into $W$
and assigns left to $u_i$ with probability 1/2 and right to $u_i$ with probability 1/2;
from each state where $X_i = (W, \mathtt{left})$ there is a step $\mathtt{wait}_i$ that does not change
the state if $\mathit{Res}_{(i,\mathtt{left})}$ = taken, and changes $pc_i$ into $S$ and $\mathit{Res}_{(i,\mathtt{left})}$ into taken if
$\mathit{Res}_{(i,\mathtt{left})}$ = free; for each state where $pc_i = E_F$ there are two steps with action
$\mathtt{dropf}_i$: one step sets $u_i$ to right and makes $\mathit{Res}_{(i,\mathtt{left})}$ free, and the other step
sets $u_i$ to left and makes $\mathit{Res}_{(i,\mathtt{right})}$ free. The two separate steps correspond to a
nondeterministic choice that is left to the adversary. For time passage steps we
assume that at any point an arbitrary amount of time can pass; thus, from each
state of $M$ and each positive $\delta$ there is a time passage step that increases the time
component by $\delta$ and does not affect the rest of the state.


The value of each pair $X_i$ can be represented concisely by the value of $pc_i$ and an
arrow (to the left or to the right) which describes the value of $u_i$. Thus, informally,
a process $i$ is in state $\overrightarrow{S}$ or $\overrightarrow{D}$ (resp. $\overleftarrow{S}$ or $\overleftarrow{D}$) when $i$ is in state $S$ or $D$ while
holding its right (resp. left) resource; process $i$ is in state $\overrightarrow{W}$ (resp. $\overleftarrow{W}$) when $i$
is waiting for its right (resp. left) resource to become free; process $i$ is in state
$\overrightarrow{E_S}$ (resp. $\overleftarrow{E_S}$) when $i$ is in its exit region and it is still holding its right (resp.
left) resource. Sometimes we are interested in sets of pairs; for example, whenever
$pc_i = F$ the value of $u_i$ is irrelevant. With the simple value of $pc_i$ we denote the set
of the two pairs $\{(pc_i, \mathtt{left}), (pc_i, \mathtt{right})\}$. Finally, with the symbol $\#$ we denote
any pair where $pc_i \in \{W, S, D\}$. The arrow notation is used as before.


For each state $s = (X_1, \ldots, X_n, \mathit{Res}_1, \ldots, \mathit{Res}_n, t)$ of $M$ we denote by $X_i(s)$ the
pair $X_i$ and by $\mathit{Res}_i(s)$ the value of the shared variable $\mathit{Res}_i$ in state $s$. Also, for any
set $S$ of states of a process $i$, we denote by $X_i \in S$, or alternatively $X_i = S$, the set of
states $s$ of $M$ such that $X_i(s) \in S$. Sometimes we abuse notation in the sense that
we write expressions like $X_i \in \{F, D\}$ with the meaning $X_i \in F \cup D$. Finally, we
write $X_i = E$ for $X_i = \{E_F, E_S, E_R\}$, and we write $X_i = T$ for $X_i \in \{F, W, S, D, P\}$.


A first basic lemma states that a reachable state of $M$ is uniquely determined by
the local states of its processes and the current time. Based on this lemma, our further
specifications of state sets will not refer to the shared variables; however, we consider
only reachable states for the analysis. The proof of the lemma is a standard proof
of invariants.


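Lemma 4.6.1 For each reachable state $s$ of $M$ and each $i$, $1 \le i \le n$, $\mathit{Res}_i$ = taken
iff $X_i(s) \in \{\overrightarrow{S}, \overrightarrow{D}, P, C, E_F, \overrightarrow{E_S}\}$ or $X_{i+1}(s) \in \{\overleftarrow{S}, \overleftarrow{D}, P, C, E_F, \overleftarrow{E_S}\}$. Moreover,
for each reachable state $s$ of $M$ and each $i$, $1 \le i \le n$, it is not the case that
$X_i(s) \in \{\overrightarrow{S}, \overrightarrow{D}, P, C, E_F, \overrightarrow{E_S}\}$ and $X_{i+1}(s) \in \{\overleftarrow{S}, \overleftarrow{D}, P, C, E_F, \overleftarrow{E_S}\}$, i.e., only one
process at a time can hold one resource.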
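The invariant of Lemma 4.6.1 can be checked mechanically on a small ring. The sketch below is an illustration, not the thesis's proof: it drops the time component (which does not influence the resources), fixes the hypothetical ring size `N = 3`, and explores every reachable untimed state by treating each scheduling decision, coin outcome and adversary choice as a possible successor. The state names follow the table above, and a local state with an arrow to the right is encoded as the pair `(pc, "right")`.

```python
from itertools import product

N = 3                      # ring size; an illustrative choice, not from the text
LEFT, RIGHT = "left", "right"

def opp(side):
    return LEFT if side == RIGHT else RIGHT

def res_index(i, side):
    """Res_i lies between i and i+1: it is i's right and (i+1)'s left resource."""
    return i if side == RIGHT else (i - 1) % N

def holds(pc, u, side):
    """Does a process in local state (pc, u) hold its resource on `side`?"""
    return pc in ("P", "C", "EF") or (pc in ("S", "D", "ES") and u == side)

def successors(state):
    """Untimed successors: every scheduling, coin and adversary choice."""
    procs, res = state
    result = []
    for i, (pc, u) in enumerate(procs):
        moves = []                       # (new pc, new u, resource index, new value)
        if pc == "R":
            moves.append(("F", u, None, None))
        elif pc == "F":
            moves += [("W", LEFT, None, None), ("W", RIGHT, None, None)]
        elif pc == "W" and not res[res_index(i, u)]:
            moves.append(("S", u, res_index(i, u), True))
        elif pc == "S":
            r = res_index(i, opp(u))
            moves.append(("D", u, None, None) if res[r] else ("P", u, r, True))
        elif pc == "D":
            moves.append(("F", u, res_index(i, u), False))
        elif pc == "P":
            moves.append(("C", u, None, None))
        elif pc == "C":
            moves.append(("EF", u, None, None))
        elif pc == "EF":
            moves += [("ES", v, res_index(i, opp(v)), False) for v in (LEFT, RIGHT)]
        elif pc == "ES":
            moves.append(("ER", u, res_index(i, u), False))
        elif pc == "ER":
            moves.append(("R", u, None, None))
        for new_pc, new_u, r, value in moves:
            new_procs, new_res = list(procs), list(res)
            new_procs[i] = (new_pc, new_u)
            if r is not None:
                new_res[r] = value       # True means taken, False means free
            result.append((tuple(new_procs), tuple(new_res)))
    return result

def check_lemma():
    """Explore all reachable states, asserting both claims of Lemma 4.6.1."""
    starts = [(tuple(("R", u) for u in us), (False,) * N)
              for us in product((LEFT, RIGHT), repeat=N)]
    seen, frontier = set(starts), list(starts)
    while frontier:
        state = frontier.pop()
        procs, res = state
        for i in range(N):
            by_i = holds(*procs[i], RIGHT)             # process i holds Res_i
            by_next = holds(*procs[(i + 1) % N], LEFT) # process i+1 holds Res_i
            assert res[i] == (by_i or by_next)
            assert not (by_i and by_next)
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)
```

Calling `check_lemma()` returns the number of reachable untimed states visited, and raises an `AssertionError` if the invariant fails anywhere.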


4.6.2. Proof Sketch 


In this section we show that the Lehmann-Rabin algorithm guarantees time bounded
progress, i.e., that from every state where some process is in its trying region, some
process subsequently enters its critical region within an expected constant time bound.
We assume that each process that is ready to perform a step does so within time 1:
process $i$ is ready to perform a step whenever it enables an action different from
$\mathtt{try}_i$ or $\mathtt{exit}_i$. Actions $\mathtt{try}_i$ and $\mathtt{exit}_i$ are supposed to be under the control of the
user, and hence, by assumption, under the control of the adversary.


Formally, consider the probabilistic timed automaton $M$ of Section 4.6.1. Define
Unit-Time to be the set of adversaries $\mathcal{A}$ for $M$ having the properties that, for
every finite execution fragment $\alpha$ of $M$ and every execution $\alpha'$ of $H(M, \mathcal{A}, \alpha)$, 1)
the time in $\alpha'$ is not bounded and 2) for every process $i$ and every state of $\alpha'$
enabling an action of process $i$ different from $\mathtt{try}_i$ and $\mathtt{exit}_i$, there exists a step in
$\alpha'$ involving process $i$ within time 1. Then Unit-Time is execution-closed according
to Definition 4.3.2. An informal justification of this fact is that the constraint that
each ready process is scheduled within time 1 knowing that $\alpha \cdot \alpha'$ has occurred only
reinforces the constraint that each ready process is scheduled within time 1 knowing
that $\alpha'$ has occurred. Let


$$\mathcal{T} = \{ s \in \mathit{rstates}(M) \mid \exists i\; X_i(s) \in \{T\} \}$$

denote the set of reachable states of $M$ where some process is in its trying region,
and let

$$\mathcal{C} = \{ s \in \mathit{rstates}(M) \mid \exists i\; X_i(s) = C \}$$

denote the set of reachable states of $M$ where some process is in its critical region.
We show that

$$\mathcal{T} \xrightarrow[1/8]{\;13\;}_{\textit{Unit-Time}} \mathcal{C},$$

i.e., that, starting from any reachable state where some process is in its trying
region, for all the adversaries of Unit-Time, with probability at least 1/8, some
process enters its critical region within time 13. Note that this property is trivially
satisfied if some process is initially in its critical region.


Our proof is divided into several phases, each one concerned with the property of
making a partial time bounded progress toward a “success state”, i.e., a state of
$\mathcal{C}$. The sets of states associated with the different phases are expressed in terms of
$\mathcal{T}$, $\mathcal{RT}$, $\mathcal{F}$, $\mathcal{G}$, $\mathcal{P}$, and $\mathcal{C}$. Here,

$$\mathcal{RT} = \{ s \in \mathcal{T} \mid \forall i\; X_i(s) \in \{E_R, R, T\} \}$$

is the set of states where at least one process is in its trying region and where no
process is in its critical region or holds resources while being in its exit region.

$$\mathcal{F} = \{ s \in \mathcal{RT} \mid \exists i\; X_i(s) = F \}$$

is the set of states of $\mathcal{RT}$ where some process is ready to flip a coin.


$$\mathcal{P} = \{ s \in \mathit{rstates}(M) \mid \exists i\; X_i(s) = P \}$$

is the set of reachable states of $M$ where some process is in its pre-critical region.
The set $\mathcal{G}$ is the most important for the analysis. It parallels the set of “Good
Pairs” in [62] or the set described in Lemma 4 of [37]. To motivate the definition, we
define the following notions. We say that a process $i$ is committed if $X_i \in \{W, S\}$,
and that a process $i$ potentially controls $\mathit{Res}_i$ (resp. $\mathit{Res}_{i-1}$) if $X_i \in \{\overrightarrow{W}, \overrightarrow{S}, \overrightarrow{D}\}$
(resp. $X_i \in \{\overleftarrow{W}, \overleftarrow{S}, \overleftarrow{D}\}$). Informally, a state in $\mathcal{RT}$ is in $\mathcal{G}$ if and only if
there is a committed process whose second resource is not potentially controlled by
another process. Such a process is called a good process. Formally,

$$\mathcal{G} = \left\{ s \in \mathcal{RT} \;\middle|\; \exists i\;
\begin{array}{l}
X_i(s) \in \{\overleftarrow{W}, \overleftarrow{S}\} \text{ and } X_{i+1}(s) \in \{E_R, R, F, \overrightarrow{\#}\}, \text{ or} \\
X_i(s) \in \{\overrightarrow{W}, \overrightarrow{S}\} \text{ and } X_{i-1}(s) \in \{E_R, R, F, \overleftarrow{\#}\}
\end{array} \right\}$$

Reaching a state of $\mathcal{G}$ is substantial progress toward reaching a state of $\mathcal{C}$. Actually,
the proof of Proposition 4.7.11 establishes that, if $i$ is a good process, then, with
probability 1/4, one of the three processes $i-1$, $i$ and $i+1$ soon succeeds in getting
its two resources. The hard part is to establish that, with constant probability,
within a constant time, $\mathcal{G}$ is reached from any state in $\mathcal{T}$. A close inspection of the
proof given in [62] shows that, there, the timed version of the techniques used is
unable to deliver this result. The phases of our proof are formally described below.


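$\mathcal{T} \xrightarrow[1]{2} \mathcal{RT} \cup \mathcal{C}$      (Proposition 4.7.3),
$\mathcal{RT} \xrightarrow[1]{3} \mathcal{F} \cup \mathcal{P}$     (Proposition 4.7.15),
$\mathcal{F} \xrightarrow[1/2]{2} \mathcal{G} \cup \mathcal{P}$    (Proposition 4.7.14),
$\mathcal{G} \xrightarrow[1/4]{5} \mathcal{P}$                     (Proposition 4.7.11),
$\mathcal{P} \xrightarrow[1]{1} \mathcal{C}$                       (Proposition 4.7.1).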


The first statement says that, within time 2, every process in its exit region
relinquishes its resources. By combining the statements above by means of
Proposition 4.3.1 and Theorem 4.3.2 we obtain

$$\mathcal{T} \xrightarrow[1/8]{\;13\;} \mathcal{C},$$

which is the property that was to be proven. Using the results of the proof summary
above, we can furthermore derive a constant upper bound on the expected time
required to reach a state of $\mathcal{C}$ when departing from a state of $\mathcal{T}$. Note that, departing
from a state in $\mathcal{RT}$, with probability at least 1/8, $\mathcal{P}$ is reached in time (at most) 10;
with probability at most 1/2, time 5 is spent before failing to reach $\mathcal{G} \cup \mathcal{P}$ (“failure
at the third arrow”); with probability at most 7/8, time 10 is spent before failing
to reach $\mathcal{P}$ (“failure at the fourth arrow”). If failure occurs, then the state is back
in $\mathcal{RT}$. Let $V$ denote a random variable satisfying the following recurrence

$$V = \tfrac{1}{8} \cdot 10 + \tfrac{1}{2} (5 + V_1) + \tfrac{3}{8} (10 + V_2),$$

where $V_1$ and $V_2$ are random variables having the same distribution as $V$. The
previous discussion shows that the expected time spent from $\mathcal{RT}$ to $\mathcal{P}$ is at most
$E[V]$. By taking expectation in the previous equation, and using that $E[V] =
E[V_1] = E[V_2]$, we obtain that $E[V] = 60$ is an upper bound on the expected time
spent from $\mathcal{RT}$ to $\mathcal{P}$, and that, consequently, the expected time for progress starting
from a state of $\mathcal{T}$ is at most 63.
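The arithmetic behind the constants 13, 1/8, 60 and 63 can be replayed with exact fractions. This is only a sketch: it assumes that chaining the five phases adds their time bounds and multiplies their probability bounds, which is what the stated result suggests (the precise composition rules are Proposition 4.3.1 and Theorem 4.3.2, not reproduced here). The variable names are ours.

```python
from fractions import Fraction

# (time bound, probability lower bound) of the five phases, in proof order.
phases = [(2, Fraction(1)),       # T  -> RT u C
          (3, Fraction(1)),       # RT -> F u P
          (2, Fraction(1, 2)),    # F  -> G u P
          (5, Fraction(1, 4)),    # G  -> P
          (1, Fraction(1))]       # P  -> C

total_time = sum(t for t, _ in phases)
total_prob = Fraction(1)
for _, p in phases:
    total_prob *= p

# Taking expectations in V = 1/8*10 + 1/2*(5 + V1) + 3/8*(10 + V2):
#   E[V] = constant + retry_probability * E[V].
constant = Fraction(1, 8) * 10 + Fraction(1, 2) * 5 + Fraction(3, 8) * 10
retry_probability = Fraction(1, 2) + Fraction(3, 8)
expected_rt_to_p = constant / (1 - retry_probability)

# Add time 2 for T -> RT u C and time 1 for P -> C.
expected_t_to_c = expected_rt_to_p + 2 + 1
```

Here `constant` equals 15/2 and `1 - retry_probability` equals 1/8, which gives the bound 60 from $\mathcal{RT}$ to $\mathcal{P}$ and 63 overall.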


4.7 The Detailed Proof 


We prove in this section the five relations used in Section 4.6.2. However, for the sake
of clarity, we do not prove the relations in the order they were presented. Throughout
the proof we abuse notation by writing events of the kind $\mathrm{FIRST}(\mathtt{flip}_i, \mathtt{left})$
meaning the event schema $\mathrm{FIRST}(\mathtt{flip}_i, \{ s \in \mathit{states}(M) \mid X_i(s) = \overleftarrow{W} \})$.


Proposition 4.7.1 If some process is in $P$, then, within time 1, it enters $C$, i.e.,

$$\mathcal{P} \xrightarrow[1]{1} \mathcal{C}.$$

PROOF. This step corresponds to the action crit: within time 1, process $i$ informs
the user that the critical region is free.


Lemma 4.7.2 If some process is in its Exit region then, within time 3, it will enter
$R$.

PROOF. The process first needs two steps to relinquish its two resources,
and then one step to send a rem message to the user.


Proposition 4.7.3 $\mathcal{T} \xrightarrow[1]{2} \mathcal{RT} \cup \mathcal{C}$.

PROOF. From Lemma 4.7.2 within time 2 every process that begins in $E_F$ or $E_S$
relinquishes its resources. If no process begins in $C$ or enters $C$ in the meantime,
then the state reached at this point is a state of $\mathcal{RT}$; otherwise, the starting state
or the state reached when the first process enters $C$ is a state of $\mathcal{C}$.


We now turn to the proof of $\mathcal{G} \xrightarrow[1/4]{5} \mathcal{P}$. The following lemmas form a detailed case
analysis of the different situations that can arise in states of $\mathcal{G}$. Informally, each
lemma shows that some event of the form of Proposition 4.4.2 is a sub-event of the
properties of reaching some other state.


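Lemma 4.7.4

1. Assume that $X_{i-1} \in \{E_R, R, F\}$ and $X_i = \overleftarrow{W}$. If $\mathrm{FIRST}(\mathtt{flip}_{i-1}, \mathtt{left})$, then,
within time 1, either $X_{i-1} = P$ or $X_i = \overleftarrow{S}$.

2. Assume that $X_{i-1} = D$ and $X_i = \overleftarrow{W}$. If $\mathrm{FIRST}(\mathtt{flip}_{i-1}, \mathtt{left})$, then, within
time 2, either $X_{i-1} = P$ or $X_i = \overleftarrow{S}$.

3. Assume that $X_{i-1} = S$ and $X_i = \overleftarrow{W}$. If $\mathrm{FIRST}(\mathtt{flip}_{i-1}, \mathtt{left})$, then, within
time 3, either $X_{i-1} = P$ or $X_i = \overleftarrow{S}$.

4. Assume that $X_{i-1} = W$ and $X_i = \overleftarrow{W}$. If $\mathrm{FIRST}(\mathtt{flip}_{i-1}, \mathtt{left})$, then, within
time 4, either $X_{i-1} = P$ or $X_i = \overleftarrow{S}$.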


PROOF. The four proofs start in the same way. Let $s$ be a state of $M$ satisfying the
respective properties of items 1 or 2 or 3 or 4. Let $\mathcal{A}$ be an adversary of Unit-Time,
and let $\alpha$ be the execution of $M$ that corresponds to an execution of $H(M, \mathcal{A}, \{s\})$
where the result of the first coin flip of process $i-1$ is left.

1. By hypothesis, $i-1$ does not hold any resource at the beginning of $\alpha$ and
has to obtain $\mathit{Res}_{i-2}$ (its left resource) before pursuing $\mathit{Res}_{i-1}$. Within time
1, $i$ takes a step in $\alpha$. If $i-1$ does not hold $\mathit{Res}_{i-1}$ when $i$ takes this step,
then $i$ progresses into configuration $\overleftarrow{S}$. If not, it must be the case that $i-1$
succeeded in getting it in the meanwhile. But, in this case, $\mathit{Res}_{i-1}$ was the
second resource needed by $i-1$ and $i-1$ therefore entered $P$.

2. If $X_i = \overleftarrow{S}$ within time 1, then we are done. Otherwise, after one unit of time,
$X_i$ is still equal to $\overleftarrow{W}$, i.e., $X_i(s') = \overleftarrow{W}$ for all states $s'$ reached in time 1.
However, process $i-1$ also takes a step within time 1. Let $\alpha = \alpha_1 \cdot \alpha_2$ such
that the last action of $\alpha_1$ corresponds to the first step taken by process $i-1$.
Then $X_{i-1}(\mathit{fstate}(\alpha_2)) = F$ and $X_i(\mathit{fstate}(\alpha_2)) = \overleftarrow{W}$. Since process $i-1$ did
not flip any coin during $\alpha_1$, from the execution closure of Unit-Time and item
1 we conclude.

3. If $X_i = \overleftarrow{S}$ within time 1, then we are done. Otherwise, after one unit of time,
$X_i$ is still equal to $\overleftarrow{W}$, i.e., $X_i(s') = \overleftarrow{W}$ for all states $s'$ reached in time 1.
However, process $i-1$ also takes a step within time 1. Let $\alpha = \alpha_1 \cdot \alpha_2$ such
that the last action of $\alpha_1$ corresponds to the first step taken by process $i-1$.
If $X_{i-1}(\mathit{fstate}(\alpha_2)) = P$ then we are also done. Otherwise it must be the case
that $X_{i-1}(\mathit{fstate}(\alpha_2)) = D$ and $X_i(\mathit{fstate}(\alpha_2)) = \overleftarrow{W}$. Since process $i-1$ did
not flip any coin during $\alpha_1$, from the execution closure of Unit-Time and item
2 we conclude.

4. If $X_i = \overleftarrow{S}$ within time 1, then we are done. Otherwise, after one unit of
time, $X_i$ is still equal to $\overleftarrow{W}$, i.e., $X_i(s') = \overleftarrow{W}$ for all states $s'$ reached in time
1. However, since within time 1 process $i$ checks its left resource and fails,
process $i-1$ gets its right resource within time 1, and hence reaches at least
state $S$. Let $\alpha = \alpha_1 \cdot \alpha_2$ where the last step of $\alpha_1$ is the first step of $\alpha$ leading
process $i-1$ to state $S$. Then $X_{i-1}(\mathit{fstate}(\alpha_2)) = S$ and $X_i(\mathit{fstate}(\alpha_2)) = \overleftarrow{W}$.
Since process $i-1$ did not flip any coin during $\alpha_1$, from the execution closure
of Unit-Time and item 3 we conclude.


Lemma 4.7.5 Assume that $X_{i-1} \in \{E_R, R, T\}$ and $X_i = \overleftarrow{W}$. If $\mathrm{FIRST}(\mathtt{flip}_{i-1}, \mathtt{left})$,
then, within time 4, either $X_{i-1} = P$ or $X_i = \overleftarrow{S}$.


PROOF. The lemma follows immediately from Lemma 4.7.4 after observing that
$X_{i-1} \in \{E_R, R, T\}$ means $X_{i-1} \in \{E_R, R, F, W, S, D, P\}$.


The next lemma is a useful tool for the proofs of Lemmas 4.7.7, 4.7.8, and 4.7.9. 


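Lemma 4.7.6 Assume that $X_i \in \{\overleftarrow{W}, \overleftarrow{S}\}$ or $X_i \in \{E_R, R, F, \overleftarrow{D}\}$ with
$\mathrm{FIRST}(\mathtt{flip}_i, \mathtt{left})$, and assume that $X_{i+1} \in \{\overrightarrow{W}, \overrightarrow{S}\}$ or $X_{i+1} \in \{E_R, R, F, \overrightarrow{D}\}$ with
$\mathrm{FIRST}(\mathtt{flip}_{i+1}, \mathtt{right})$. Then the first of the two processes $i$ or $i+1$ testing its second
resource enters $P$ after having performed this test (if this time ever comes).


PROOF. By Lemma 4.6.1 $\mathit{Res}_i$ is free. Moreover, $\mathit{Res}_i$ is the second resource needed
by both $i$ and $i+1$. Whichever tests for it first gets it and enters $P$.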


Lemma 4.7.7 If $X_i = \overleftarrow{S}$ and $X_{i+1} \in \{\overrightarrow{W}, \overrightarrow{S}\}$ then, within time 1, one of the two
processes $i$ or $i+1$ enters $P$. The same result holds if $X_i \in \{\overleftarrow{W}, \overleftarrow{S}\}$ and $X_{i+1} = \overrightarrow{S}$.


PROOF. Being in state $S$, process $i$ tests its second resource within time 1. An
application of Lemma 4.7.6 finishes the proof.


Lemma 4.7.8 Assume that $X_i = \overleftarrow{S}$ and $X_{i+1} \in \{E_R, R, F, \overrightarrow{D}\}$. If
$\mathrm{FIRST}(\mathtt{flip}_{i+1}, \mathtt{right})$, then, within time 1, one of the two processes $i$ or $i+1$ enters $P$.
The same result holds if $X_i \in \{E_R, R, F, \overleftarrow{D}\}$, $X_{i+1} = \overrightarrow{S}$ and $\mathrm{FIRST}(\mathtt{flip}_i, \mathtt{left})$.


PROOF. Being in state $S$, process $i$ tests its second resource within time 1. An
application of Lemma 4.7.6 finishes the proof.


Lemma 4.7.9 Assume that $X_{i-1} \in \{E_R, R, T\}$, $X_i = \overleftarrow{W}$, and $X_{i+1} \in \{E_R, R, F, \overrightarrow{W},
\overrightarrow{D}\}$. If $\mathrm{FIRST}(\mathtt{flip}_{i-1}, \mathtt{left})$ and $\mathrm{FIRST}(\mathtt{flip}_{i+1}, \mathtt{right})$, then within time 5 one of
the three processes $i-1$, $i$ or $i+1$ enters $P$.


PROOF. Let $s$ be a state of $M$ such that $X_{i-1}(s) \in \{E_R, R, T\}$, $X_i(s) = \overleftarrow{W}$, and
$X_{i+1}(s) \in \{E_R, R, F, \overrightarrow{W}, \overrightarrow{D}\}$. Let $\mathcal{A}$ be an adversary of Unit-Time, and let $\alpha$ be
the execution of $M$ that corresponds to an execution of $H(M, \mathcal{A}, \{s\})$ where the
result of the first coin flip of process $i-1$ is left and the result of the first coin
flip of process $i+1$ is right. By Lemma 4.7.5, within time 4 either process $i-1$
reaches configuration $P$ in $\alpha$ or process $i$ reaches configuration $\overleftarrow{S}$ in $\alpha$. If $i-1$
reaches configuration $P$, then we are done. If not, then let $\alpha = \alpha_1 \cdot \alpha_2$ such that
$\mathit{lstate}(\alpha_1)$ is the first state $s'$ of $\alpha$ with $X_i(s') = \overleftarrow{S}$. If $i+1$ enters $P$ before the
end of $\alpha_1$, then we are done. Otherwise, $X_{i+1}(\mathit{fstate}(\alpha_2))$ is either in $\{\overrightarrow{W}, \overrightarrow{S}\}$ or
it is in $\{E_R, R, F, \overrightarrow{D}\}$ and process $i+1$ has not flipped any coin yet in $\alpha$. From
execution closure of Unit-Time we can then apply Lemma 4.7.6. Within one more
time unit process $i$ tests its second resource and enters $P$ if process $i+1$ did not check
its second resource in the meantime. On the other hand, process $i+1$ enters $P$ if it
checks its second resource before $i$ does so.


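Lemma 4.7.10 Assume that $X_{i-1} \in \{E_R, R, F, \overleftarrow{W}, \overleftarrow{D}\}$, $X_i = \overrightarrow{W}$, and $X_{i+1} \in
\{E_R, R, T\}$. If $\mathrm{FIRST}(\mathtt{flip}_{i-1}, \mathtt{left})$ and $\mathrm{FIRST}(\mathtt{flip}_{i+1}, \mathtt{right})$, then within time
5 one of the three processes $i-1$, $i$ or $i+1$ enters $P$.


PROOF. Analogous to Lemma 4.7.9.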


Proposition 4.7.11 Starting from a global configuration in $\mathcal{G}$, with probability
at least 1/4 and within time at most 5, some process enters $P$. Equivalently:

$$\mathcal{G} \xrightarrow[1/4]{5} \mathcal{P}.$$

PROOF. Lemmas 4.7.7 and 4.7.8 jointly treat the case where $X_i = \overleftarrow{S}$ and $X_{i+1} \in
\{E_R, R, F, \overrightarrow{\#}\}$ and the symmetric case where $X_{i-1} \in \{E_R, R, F, \overleftarrow{\#}\}$ and $X_i =
\overrightarrow{S}$; Lemmas 4.7.9 and 4.7.10 jointly treat the case where $X_i = \overleftarrow{W}$ and $X_{i+1} \in
\{E_R, R, F, \overrightarrow{W}, \overrightarrow{D}\}$ and the symmetric case where $X_{i-1} \in \{E_R, R, F, \overleftarrow{W}, \overleftarrow{D}\}$ and
$X_i = \overrightarrow{W}$.

Specifically, each lemma shows that a compound event of the kind $\mathrm{FIRST}(\mathtt{flip}_i, x)$
and $\mathrm{FIRST}(\mathtt{flip}_j, y)$ leads to $\mathcal{P}$. Each of the basic events $\mathrm{FIRST}(\mathtt{flip}_i, x)$ has
probability 1/2. From Proposition 4.4.2 each of the compound events has probability at
least 1/4. Thus the probability of reaching $\mathcal{P}$ within time 5 is at least 1/4.


We now turn to $\mathcal{F} \xrightarrow[1/2]{2} \mathcal{G} \cup \mathcal{P}$. The proof is divided in two parts and constitutes the
global argument of the proof of progress.


Lemma 4.7.12 Start with a state $s$ of $\mathcal{F}$. If there exists a process $i$ for which
$X_i(s) = F$ and $(X_{i-1}, X_{i+1}) \neq (\overrightarrow{\#}, \overleftarrow{\#})$, then, with probability at least 1/2, a state of
$\mathcal{G} \cup \mathcal{P}$ is reached within time 1.


PROOF. The conclusion holds trivially if $s \in \mathcal{G}$. Let $s$ be a state of $\mathcal{F} - \mathcal{G}$ and let $i$ be
such that $X_i(s) = F$ and $(X_{i-1}, X_{i+1}) \neq (\overrightarrow{\#}, \overleftarrow{\#})$. Assume without loss of generality
that $X_{i+1} \neq \overleftarrow{\#}$, i.e., $X_{i+1} \in \{E_R, R, F, \overrightarrow{\#}\}$. (The case for $X_{i-1} \neq \overrightarrow{\#}$ is similar.) We
can furthermore assume that $X_{i+1} \in \{E_R, R, F, \overrightarrow{D}\}$ since if $X_{i+1} \in \{\overrightarrow{W}, \overrightarrow{S}\}$ then $s$
is already in $\mathcal{G}$.


We show that the event $\mathrm{NEXT}((\mathtt{flip}_i, \mathtt{left}), (\mathtt{flip}_{i+1}, \mathtt{right}))$, which by
Proposition 4.4.2 has probability at least 1/2, leads in time at most 1 to a state of $\mathcal{G} \cup \mathcal{P}$. Let
$\mathcal{A}$ be an adversary of Unit-Time, and let $\alpha$ be the execution of $M$ that corresponds
to an execution of $H(M, \mathcal{A}, \{s\})$ where if process $i$ flips before process $i+1$ then
process $i$ flips left, and if process $i+1$ flips before process $i$ then process $i+1$ flips
right.


Within time 1, $i$ takes one step and reaches $W$. Let $j \in \{i, i+1\}$ be the first
of $i$ and $i+1$ that reaches $W$ and let $s_1$ be the state reached after the first time
process $j$ reaches $W$. If some process reached $P$ in the meantime, then we are done.
Otherwise there are two cases to consider. If $j = i$, then $\mathtt{flip}_i$ gives left and
$X_i(s_1) = \overleftarrow{W}$ whereas $X_{i+1}$ is (still) in $\{E_R, R, F, \overrightarrow{D}\}$. Therefore, $s_1 \in \mathcal{G}$. If $j = i+1$,
then $\mathtt{flip}_{i+1}$ gives right and $X_{i+1}(s_1) = \overrightarrow{W}$ whereas $X_i(s_1)$ is (still) $F$. Therefore,
$s_1 \in \mathcal{G}$.


Lemma 4.7.13 Start with a state $s$ of $\mathcal{F}$. Assume that there exists a process $i$ for
which $X_i(s) = F$ and for which $(X_{i-1}(s), X_{i+1}(s)) = (\overrightarrow{\#}, \overleftarrow{\#})$. Then, with probability
at least 1/2, within time 2, a state of $\mathcal{G} \cup \mathcal{P}$ is reached.


PROOF. The hypothesis can be summarized into the form $(X_{i-1}(s), X_i(s), X_{i+1}(s))
= (\overrightarrow{\#}, F, \overleftarrow{\#})$. Since $i-1$ and $i+1$ point in different directions, by moving to the
right of $i+1$ there is a process $k$ pointing to the left such that process $k+1$ either
points to the right or is in $\{E_R, R, F\}$, i.e., $X_k(s) \in \{\overleftarrow{W}, \overleftarrow{S}, \overleftarrow{D}\}$ and $X_{k+1}(s) \in
\{E_R, R, F, \overrightarrow{W}, \overrightarrow{S}, \overrightarrow{D}\}$. If $X_k(s) \in \{\overleftarrow{W}, \overleftarrow{S}\}$ then $s \in \mathcal{G}$ and we are done. Thus, we
can restrict our attention to the case where $X_k(s) = \overleftarrow{D}$.


We show that the event $\mathrm{NEXT}((\mathtt{flip}_k, \mathtt{left}), (\mathtt{flip}_{k+1}, \mathtt{right}))$, which by
Proposition 4.4.2 has probability at least 1/2, leads in time at most 2 to $\mathcal{G} \cup \mathcal{P}$. Let $\mathcal{A}$ be
an adversary of Unit-Time, and let $\alpha$ be an execution of $M$ that corresponds to an
execution of $H(M, \mathcal{A}, \{s\})$ where if process $k$ flips before process $k+1$ then process
$k$ flips left, and if process $k+1$ flips before process $k$ then process $k+1$ flips right.


Within time 2, process $k$ takes at least two steps and hence goes to configuration $W$.
Let $j \in \{k, k+1\}$ be the first of $k$ and $k+1$ that reaches $W$ and let $s_1$ be the state
reached after the first time process $j$ reaches $W$. If some process reached $P$ in the
meantime, then we are done. Otherwise there are two cases to consider. If $j = k$,
then $\mathtt{flip}_k$ gives left and $X_k(s_1) = \overleftarrow{W}$ whereas $X_{k+1}$ is (still) in $\{E_R, R, F, \overrightarrow{\#}\}$.
Therefore, $s_1 \in \mathcal{G}$. If $j = k+1$, then $\mathtt{flip}_{k+1}$ gives right and $X_{k+1}(s_1) = \overrightarrow{W}$
whereas $X_k(s_1)$ is (still) in $\{\overleftarrow{D}, F\}$. Therefore, $s_1 \in \mathcal{G}$.


Proposition 4.7.14 Start with a state $s$ of $\mathcal{F}$. Then, with probability at least 1/2,
within time 2, a state of $\mathcal{G} \cup \mathcal{P}$ is reached. Equivalently:

$$\mathcal{F} \xrightarrow[1/2]{2} \mathcal{G} \cup \mathcal{P}.$$

PROOF. The two different hypotheses of Lemmas 4.7.12 and 4.7.13 form a partition
of $\mathcal{F}$.


Finally, we prove $\mathcal{RT} \xrightarrow[1]{3} \mathcal{F} \cup \mathcal{P}$.


Proposition 4.7.15 Starting from a state $s$ of $\mathcal{RT}$, within time 3, a state of
$\mathcal{F} \cup \mathcal{P}$ is reached. Equivalently:

$$\mathcal{RT} \xrightarrow[1]{3} \mathcal{F} \cup \mathcal{P}.$$

PROOF. Let $s$ be a state of $\mathcal{RT}$. If $s \in \mathcal{F}$, then we are trivially done. We can
therefore restrict ourselves to the case where in $s$ each process is in $\{E_R, R, W, S, D\}$
and where there exists at least one process in $\{W, S, D\}$. Furthermore we can restrict
ourselves to the case where no process reaches $P$ in time 3, i.e., where the state stays
in $\mathcal{RT}$. (Else we are done.) Let $\mathcal{A}$ be an adversary of Unit-Time, and let $\alpha$ be the
execution of $M$ that corresponds to an execution of $H(M, \mathcal{A}, \{s\})$.


Within time 1 a process reaches $\{S, D, F\}$. Therefore within time 2 a process
reaches $\{D, F\}$. Therefore within time 3 a process reaches $\{F\}$. As, by assumption,
the state stays in $\mathcal{RT}$ in time 3, we have therefore proven that $\mathcal{F}$ is reached in time
3.


Chapter 5 


A Deterministic Scheduling 
Protocol 


In this chapter we present a scheduling problem, analyze it and provide optimal 
deterministic solutions for it. The proof involves re-expressing the problem in graph- 
theoretical terms. In particular the main tool used in the proof of optimality is Ore’s 
Deficiency Theorem [45] giving a dual expression of the size of a maximum matching 
in a bipartite graph. We will consider in Chapter 7 the randomized version of this 
scheduling problem. 


5.1 Introduction 


Many control systems are subject to failures that can have dramatic effects. One 
simple way to deal with this problem is to build in some redundancy so that the 
whole system is able to function even if parts of it fail. In a general situation, the 
system’s manager has access to some observations allowing it to control the system 
efficiently. Such observations bring information about the state of the system that 
might consist of partial fault reports. The available controls might include repairs 
and/or replacement of faulty processors. 


To model the problem, one needs to make assumptions regarding the occurrence 
of faults. Typically, they are assumed to occur according to some stochastic pro- 
cess. To make the model more tractable, one often considers the process to be 
memoryless, i.e. faults occur according to some exponential distribution. However, 
to be more realistic, many complications and variations can be introduced in the
stochastic model, and they complicate the time analysis. Examples are: a processor
might become faulty at any time or only during specific operations; the fault rate 
might vary according to the work load; faults might occur independently among the 
processors or may depend on proximity. The variations seem endless and the results 
are rarely general enough so as to carry some information or methodology from one 
model to another. 


One way to derive general results, independent of the specific assumptions about the 
time of occurrence of faults, is to adopt a discrete time, that, instead of following 
an absolute frame, is incremented only at each occurrence of a fault. Within this 
framework, we measure the maximal number of faults to be observed until the 
occurrence of a crash instead of the maximal time of survival of a system until the 
occurrence of a crash. 


As an introduction to this general situation, we make the following assumptions and 
simplifications: 


Redundancy of the system: We assume the existence of a pool $\mathcal{N}$ composed of $p$
identical processors from among which, at every time $t$, a set $s_t$ of $n$ processors
is selected to configure the system. The system works satisfactorily as long as
at least $n - m$ processors among the $n$ currently in operation are not faulty.
However, the system cannot tolerate more than $m$ faults at any given time: it
stops functioning if $m + 1$ processors among these $n$ processors are faulty.


Occurrence of faults, reports and logical time: We consider the situation in
which failures do not occur simultaneously and where, whenever a processor
fails, a report is issued, stating that a failure has occurred, but without
specifying the location of the failure. (Reporting additional information might be
too expensive or time consuming.) Based on these reports, the scheduler might
decide to reconfigure the system whenever such a failure is reported. As a result,
we restrict our attention to the discrete model, in which time $t$ corresponds to
the $t$-th failure in the system.


Repairs: No repair is being performed. 


Deterministic Algorithms: We assume that the scheduler does not use random- 
ness. 


Since the universe consists of only $p$ processors, and one processor fails at each
time, no scheduling policy can guarantee that the system survives beyond time $p$.
(A better a priori upper bound is $p - n + m + 1$: at this time, only $n - m - 1$
processors are still non-faulty. This does not allow for the required quorum of $n - m$
non-faulty processors.) But some scheduling policies seem to allow the system to
survive longer than others. An obviously bad policy is to choose $n$ processors once
and for all and never to change them: the system would then collapse at time $m + 1$.
This chapter investigates the problem of determining the best survival time.


This best survival time is defined from a worst-case point-of-view: a given scheduler 
allows the system to survive (up to a certain time) only if it allows it to survive 
against all possible failure patterns in which one processor fails at each time. 


Our informal description so far apparently constrains the faults to occur in on-line
fashion: for each $t$, the $t$-th fault occurs before the scheduler decides the set $s_{t+1}$ to
be used subsequently. However, since we have assumed that no reports about the
locations of the faults are available, there is no loss of generality in requiring the
sets $s_t$ to be determined a priori. (Of course, in practice, some more precise fault
information may be available, and each set $s_t$ would depend on the fault pattern up
to time $t$.) Also, as we have assumed a deterministic scheduler, we can assume that
the decisions $s_1, \ldots, s_p$ are revealed before the occurrence of any fault. We express
this by saying that the faults occur in an off-line fashion.


5.2 The Model 


Throughout this chapter, we fix a universal set $\mathcal{N}$ of processors, and let $p$ denote
its cardinality. We also fix a positive integer $n$ ($n < p$) representing the number of
processors that are needed at each time period, and a positive integer $m$ representing
the number of failures that can be tolerated ($m < n$).


We model the situation described in the introduction as a simple game between two 
entities, a scheduler and an adversary. The game consists of only one round, in 
which the scheduler plays first and the adversary second. The scheduler plays by 
selecting a sequence of p sets of processors (the schedule), each set of size n, and the 
adversary responds by choosing, from each set selected by the scheduler, a processor 
to kill. We consider only sequences of size p because the system must collapse by 
time p, since, at each time period, a new processor breaks down. 


Formally, a schedule $S$ is defined to be a finite sequence, $s_1, \ldots, s_p$, of subsets of $\mathcal{N}$,
such that $|s_t| = n$ for all $t$, $1 \le t \le p$. An adversary $A$ is defined to be a function
associating to every schedule $S = (s_1, \ldots, s_p)$ a sequence $A(S) = (f_1, \ldots, f_p)$ of
elements of $\mathcal{N}$ such that $f_t \in s_t$ for every $t$.


Now let $S$ be a schedule, and $A$ an adversary. Define the survival time, $T(S, A)$,
to be the largest value of $t$ such that, for all $u \le t$, $|\{f_1, \ldots, f_u\} \cap s_u| \le m$ (where
$(f_1, \ldots, f_p) = A(S)$). That is, for all time periods $u$ up to and including time period
$t$, there are no more than $m$ processors in the set $s_u$ that have failed by time $u$.


We are interested in the minimum survival time for a particular schedule, with
respect to arbitrary adversaries. Thus, we define the minimum survival time for a
schedule, $T(S)$, to be $T(S) = \min_A T(S, A)$. An adversary $A$ for which $T(S, A) =
T(S)$ is said to be minimal for $S$. Finally, we are interested in determining
the schedule that guarantees the greatest minimum survival time. Thus, we define
the optimum survival time $t_{opt}$ to be $\max_S T(S) = \max_S \min_A T(S, A)$. Also define
a schedule $S$ to be optimum provided that $T(S) = t_{opt}$. Our objectives in this
chapter are to compute $t_{opt}$ as a function of $p$, $n$ and $m$, to exhibit an optimum
schedule, and to determine a minimal adversary for each schedule.
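The definitions of $T(S, A)$, $T(S)$ and $t_{opt}$ translate directly into an exhaustive search, which is feasible only for very small $p$. The sketch below is an illustration under our own conventions: processors are the integers `0..p-1`, a schedule is a sequence of `frozenset`s, and an adversary is represented by the fault sequence it produces for the schedule at hand. The function names are hypothetical.

```python
from itertools import combinations, product

def survival_time(schedule, faults, m):
    """T(S, A): largest t such that every u <= t has at most m failed processors in s_u."""
    failed = set()
    t = 0
    for s_u, f_u in zip(schedule, faults):
        failed.add(f_u)
        if len(failed & s_u) > m:
            break
        t += 1
    return t

def min_survival_time(schedule, m):
    """T(S): minimum of T(S, A) over all fault sequences with f_t in s_t."""
    return min(survival_time(schedule, faults, m)
               for faults in product(*schedule))

def optimum_survival_time(p, n, m):
    """t_opt by brute force over all schedules of p sets of size n."""
    sets = [frozenset(c) for c in combinations(range(p), n)]
    return max(min_survival_time(schedule, m)
               for schedule in product(sets, repeat=p))
```

For instance, with `p = 4`, `n = 2`, `m = 1`, the schedule `{0,1}, {2,3}, {0,1}, {2,3}` survives the fault sequence `0, 2, 1, 3` for two periods: the third fault puts two failed processors into the third set.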


5.3 The Result 


Recall that 1 < m < n < p are three fixed integers. Our main result is stated 
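in terms of the following function defined on the set of positive real numbers (see
Figure 5.1):

$$h_{n,m}(k) \;\stackrel{\mathrm{def}}{=}\; \left\lfloor \frac{k}{n} \right\rfloor m + \left( k - \left\lfloor \frac{k}{n} \right\rfloor n + m - n \right)^{+},$$

where $(x)^{+} = \max(x, 0)$. In particular, $h_{n,m}(k) = \frac{k}{n}\, m$ when $n$ divides $k$.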

The main result of this chapter is: 


Theorem 5.3.1
$$t_{opt} = h_{n,m}(p).$$


We will present our proof in two lemmas proving respectively that $t_{opt}$ is no smaller
and no bigger than $h_{n,m}(p)$.


Lemma 5.3.2
$$t_{opt} \ge h_{n,m}(p).$$


PROOF. Consider the schedule $S_{trivial}$ in which the $p$ processors are partitioned into
$\lfloor p/n \rfloor$ batches of $n$ processors each and one batch of $q = p - \lfloor p/n \rfloor n$. Each of the first
$\lfloor p/n \rfloor$ batches is used $m$ time periods and then set aside. Then, the last batch of
processors along with any $n - q$ of the processors set aside is used for $(m + q - n)^{+}$ time
periods. It is easy to see that no adversary can succeed in killing $m + 1$ processors within a
batch before this schedule expires.
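The batch schedule of this proof can be built and checked by brute force for small parameters. The following sketch makes several assumptions of its own: processors are numbered `0..p-1`, the $n - q$ reused processors are taken from the first batch, and the schedule is padded to length $p$ by repeating its last set, since the formal model asks for exactly $p$ sets while the proof only cares about the first $h_{n,m}(p)$ periods. All function names are hypothetical.

```python
from itertools import product

def h(n, m, k):
    """h_{n,m}(k) = floor(k/n)*m + (k - floor(k/n)*n + m - n)^+ for integer k."""
    full = k // n
    return full * m + max(k - full * n + m - n, 0)

def trivial_schedule(p, n, m):
    """Batch schedule of Lemma 5.3.2, padded to length p (padding is our choice)."""
    full, q = divmod(p, n)
    schedule = []
    for b in range(full):
        batch = frozenset(range(b * n, (b + 1) * n))
        schedule += [batch] * m                      # each full batch is used m times
    # Last set: the q leftover processors plus n - q processors already set aside.
    last = frozenset(range(full * n, p)) | frozenset(range(n - q))
    schedule += [last] * max(m + q - n, 0)
    schedule += [last] * (p - len(schedule))         # padding after the schedule expires
    return schedule[:p]

def min_survival_time(schedule, m):
    """Minimum survival time over every fault sequence with f_t in s_t."""
    best = len(schedule)
    for faults in product(*schedule):
        failed, t = set(), 0
        for s_u, f_u in zip(schedule, faults):
            failed.add(f_u)
            if len(failed & s_u) > m:
                break
            t += 1
        best = min(best, t)
    return best
```

Comparing `min_survival_time(trivial_schedule(p, n, m), m)` with `h(n, m, p)` for a few small triples gives a quick sanity check of the lower bound; the search is exponential in `p`, so it is not meant for realistic sizes.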
Figure 5.1: The function $h_{n,m}(k)$


In order to prove the other direction of Theorem 5.3.1, we need the following result
about the rate of increase of the function $h_{n,m}(\cdot)$.


Lemma 5.3.3 For $0 \le k$ and $0 \le l \le n$ we have $h_{n,m}(k) \le h_{n,m}(k + l) + n - l - m$.


Proof. Notice first that $h_{n,m}(k) = h_{n,m}(k + n) - m$ for all $k \ge 0$. Moreover, the
function $h_{n,m}$ increases at a sublinear rate (see Figure 5.1) so that, for $p, q \ge 0$, we have
$h_{n,m}(p + q) \le h_{n,m}(p) + q$. Letting $p = k + l$ and $q = n - l$, we obtain

$$h_{n,m}(k) = h_{n,m}(k + n) - m \le h_{n,m}(k + l) + n - l - m,$$


which proves the lemma. 
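The inequality of Lemma 5.3.3 is easy to test exhaustively on small integer parameters. The sketch below is a numerical check and not a proof; the helper names `h` and `lemma_5_3_3_holds` and the chosen ranges are assumptions made for illustration.

```python
def h(n, m, k):
    """h_{n,m}(k) for integer k, as defined in Section 5.3."""
    full, rem = divmod(k, n)
    return full * m + max(rem + m - n, 0)


def lemma_5_3_3_holds(n, m, k_max):
    """Check h(k) <= h(k + l) + n - l - m for all 0 <= k <= k_max and 0 <= l <= n."""
    return all(
        h(n, m, k) <= h(n, m, k + l) + n - l - m
        for k in range(k_max + 1)
        for l in range(n + 1)
    )


# Try every pair 1 <= m <= n <= 6 on a small range of k.
print(all(lemma_5_3_3_holds(n, m, 40) for n in range(1, 7) for m in range(1, n + 1)))
```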


5.4 The Upper Bound 


In this section we establish the other direction of the main theorem. We begin with 
some general graph theoretical definitions. 


Definition 5.4.1 


• For every vertex $v$ of a graph $G$, we let $\gamma_G(v)$ denote the set of vertices adjacent
to $v$. We can extend this notation to sets: for all sets $C$ of vertices, $\gamma_G(C) = \bigcup_{v \in C} \gamma_G(v)$.

• For every bipartite graph $G$, $\nu(G)$ denotes the size of a maximum matching
of $G$.

• For every pair of positive integers $L$, $R$, a left totally ordered bipartite graph
of size $(L, R)$ is a bipartite graph with bipartition $\mathcal{L}, \mathcal{R}$, where $\mathcal{L}$ is a totally
ordered set of size $L$ and $\mathcal{R}$ is a set of size $R$. We label $\mathcal{L} = \{a_1, \ldots, a_L\}$ so
that $a_i < a_j$ for every $1 \le i < j \le L$. For every $\mathcal{L}' \subseteq \mathcal{L}$ and $\mathcal{R}' \subseteq \mathcal{R}$, the
subgraph induced by $\mathcal{L}'$ and $\mathcal{R}'$ is a left totally ordered bipartite graph with
the total order on $\mathcal{L}$ inducing the total order on $\mathcal{L}'$.

• Let $G$ be a left totally ordered bipartite graph of size $(L, R)$. For $t = 1, \ldots, L$,
we let $I_t(G)$ denote the left totally ordered subgraph of $G$ induced by the
subsets $\{a_1, a_2, \ldots, a_{t-1}\} \subseteq \mathcal{L}$ and $\gamma_G(a_t) \subseteq \mathcal{R}$.


Let us briefly justify the notion of left total order. In this definition, we have in mind
that $\mathcal{L}$ represents the labels attached to the different times, and that $\mathcal{R}$ represents
the labels attached to the available processors. The times are naturally ordered.
The main argument used in the proof is to reduce an existing schedule to a shorter
one. In doing so, we in particular select a subsequence of times. Although these
times are not necessarily consecutive, they are still naturally ordered. The total
order on $\mathcal{L}$ is the precise notion formalizing the ordering structure characterizing
time.


Consider a finite schedule $S = s_1, \ldots, s_T$. In graph theoretic terms, it can be represented
as a left totally ordered bipartite graph $G$ with bipartition $\mathcal{T} = \{1, 2, \ldots, T\}$
and $\mathcal{N} = \{1, 2, \ldots, p\}$. There is an edge between vertex $t \in \mathcal{T}$ and vertex $i \in \mathcal{N}$ if
the processor $i$ is selected at time $t$. The fact that, for all $t$, $|s_t| = n$ translates into
the fact that vertex $t \in \mathcal{T}$ has degree $n$. For such a bipartite graph, the game of the
adversary consists in selecting one edge incident to each vertex $t \in \mathcal{T}$.


Observe that the adversary can kill the schedule at time $t$ if it has already killed,
before time $t$, $m$ of the $n$ processors used at time $t$. It then kills another one at time
$t$ and the system collapses. In terms of the graph $G$, there exists an adversary that
kills the schedule at time $t$ if and only if the subgraph $I_t(G)$ has a matching of size
$m$, i.e. $\nu(I_t(G)) \ge m$. Therefore, the set $\mathcal{B}$ that we now define represents the set of
pairs of integers $L$ and $R$ for which there exists a schedule that survives at time $L$, when $R$
processors are available.


Definition 5.4.2 Let $L$ and $R$ be two positive integers. $(L, R) \in \mathcal{B}$ iff there exists
a left totally ordered bipartite graph $G$ of size $(L, R)$ with bipartition $\mathcal{L}$ and $\mathcal{R}$
satisfying the two following properties:

1. All vertices in $\mathcal{L}$ have degree exactly equal to $n$,

2. For every $t = 1, \ldots, |\mathcal{L}|$, all matchings in $I_t(G)$ have size at most equal to
$m - 1$, i.e. $\nu(I_t(G)) \le m - 1$.


The main tool used in the proof of Theorem 5.3.1 is the following duality result for 
the maximum bipartite matching problem, known as Ore’s Deficiency Theorem [45]. 
A simple proof of this theorem and related results can be found in [39]. 


Theorem 5.4.1 Let $G$ be a bipartite graph with bipartition $A$ and $B$. Then the size
$\nu(G)$ of a maximum matching is given by the formula:

$$\nu(G) = \min_{C \subseteq B} \left[\, |B - C| + |\gamma_G(C)| \,\right]. \qquad (5.1)$$
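Formula (5.1) can be checked by brute force on a small bipartite graph. The sketch below is an illustration under assumptions made here: the graph is given as a dictionary mapping each vertex of $B$ to its set of neighbours in $A$, the example graph and the function names are hypothetical, and the matching routine is the standard augmenting-path method rather than anything specific to the references cited in the text.

```python
from itertools import combinations


def max_matching(adj_b):
    """Size of a maximum matching; adj_b maps each vertex of B to its neighbours in A."""
    matched = {}  # vertex of A -> vertex of B currently matched to it

    def augment(b, visited):
        for a in adj_b[b]:
            if a in visited:
                continue
            visited.add(a)
            # Use a if it is free, or if its current partner can be re-matched elsewhere.
            if a not in matched or augment(matched[a], visited):
                matched[a] = b
                return True
        return False

    return sum(1 for b in adj_b if augment(b, set()))


def deficiency_formula(adj_b):
    """Right-hand side of (5.1): min over C subset of B of |B - C| + |gamma_G(C)|."""
    vertices = list(adj_b)
    best = len(vertices)  # C empty
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            neighbours = set().union(*(adj_b[b] for b in subset))
            best = min(best, len(vertices) - size + len(neighbours))
    return best


graph = {"b1": {"a1", "a2"}, "b2": {"a1"}, "b3": {"a1"}}
print(max_matching(graph), deficiency_formula(graph))
```

The enumeration over all subsets $C$ is exponential and is meant only for tiny examples; its value is that it evaluates the right-hand side of (5.1) literally.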
The following lemma is crucial for our proof. 


Lemma 5.4.2 There are no positive integers $L$ and $R$ such that $(L, R) \in \mathcal{B}$ and
such that $L > h_{n,m}(R)$.


Proof. Working by contradiction, consider two positive integers $L$ and $R$ such
that $(L, R) \in \mathcal{B}$ and $L > h_{n,m}(R)$. We first show the existence of two integers $L'$
and $R'$ such that $L' < L$, $(L', R') \in \mathcal{B}$ and $L' > h_{n,m}(R')$.

Let $\mathcal{L} = \{a_1, a_2, \ldots, a_L\}$ and $\mathcal{R} = \{b_1, b_2, \ldots, b_R\}$ be the bipartition of the graph $G$
whose existence is ensured by the hypothesis $(L, R) \in \mathcal{B}$.

We apply Theorem 5.4.1 to the graph $I_L(G)$ where we set $A = \{a_1, a_2, \ldots, a_{L-1}\}$ and
$B = \gamma_G(a_L)$. Let $C$ denote a subset of $B$ for which the minimum in (5.1) is attained.
($C$ is possibly empty.) Define $\mathcal{L}' = \mathcal{L} - (\{a_L\} \cup \gamma_{I_L(G)}(C))$ and $\mathcal{R}' = \mathcal{R} - C$ and let
$L'$ and $R'$ denote the cardinalities of $\mathcal{L}'$ and $\mathcal{R}'$. Hence, $L' = L - 1 - |\gamma_{I_L(G)}(C)|$ so
that $L' < L$. Consider the bipartite subgraph $G'$ of $G$ induced by the set of vertices
$\mathcal{L}' \cup \mathcal{R}'$. In other words, in order to construct $G'$ from $G$, we remove the set $C \cup \{a_L\}$
of vertices and all vertices adjacent to some vertex in $C$. We have illustrated this
construction in Figure 5.2. In that specific example, $n = 4$, $m = 3$, $L = 6$ and
$R = 7$, while $h_{4,3}(7) = 5$. One can show that $C = \{b_5, b_6, b_7\}$ and as a result $G'$ is
the graph induced by the vertices $\{a_1, a_2, a_3, a_4, b_1, b_2, b_3, b_4\}$. The graph $G'$ has size
$(L', R') = (4, 4)$.


We first show that $(L', R') \in \mathcal{B}$. Since the vertices in $\mathcal{L}'$ correspond to the vertices
of $\mathcal{L} - \{a_L\}$ not connected to $C$, their degree in $G'$ is also $n$. Furthermore, $G'$, being
a subgraph of $G$, inherits property 2 of Definition 5.4.2. Indeed, assume that there
is a vertex $a_{t'}$ in $G'$ such that $I_{t'}(G')$ has a matching of size $m$. Let $t$ be the label of
the corresponding vertex in graph $G$. Since the total order on $\mathcal{L}'$ is induced by the
total order on $\mathcal{L}$, $I_{t'}(G')$ is a subgraph of $I_t(G)$. Therefore, $I_t(G)$ would also have a
matching of size $m$, a contradiction.

Figure 5.2: An example of the construction of $G'$ from $G$. The vertices in $C$ are
darkened.


Let us show that $L' > h_{n,m}(R')$. The assumption $(L, R) \in \mathcal{B}$ implies that $m - 1 \ge
\nu(I_L(G))$. Using Theorem 5.4.1 and the fact that $B = \gamma_G(a_L)$ has cardinality $n$, this
can be rewritten as

$$m - 1 \ge \nu(I_L(G)) = |B - C| + |\gamma_{I_L(G)}(C)| = n - |C| + |\gamma_{I_L(G)}(C)|. \qquad (5.2)$$

Since $C \subseteq B \subseteq \mathcal{R}$, we have that $0 \le |C| \le n \le R$ and, thus, the hypotheses of
Lemma 5.3.3 are satisfied for $k = R - |C|$ and $l = |C|$. Therefore, we derive from
the lemma that

$$h_{n,m}(R') = h_{n,m}(R - |C|) \le h_{n,m}(R) + n - |C| - m.$$

Using (5.2), this implies that

$$h_{n,m}(R') \le h_{n,m}(R) - |\gamma_{I_L(G)}(C)| - 1.$$

By assumption, $L$ is strictly greater than $h_{n,m}(R)$, implying

$$h_{n,m}(R') < L - 1 - |\gamma_{I_L(G)}(C)|.$$

But the right-hand side of this inequality is precisely $L'$, implying that $L' > h_{n,m}(R')$.


We have therefore established that for all integers $L$ and $R$ such that $(L, R) \in \mathcal{B}$
and $L > h_{n,m}(R)$, there exist two integers $L'$ and $R'$ such that $L' < L$, $(L', R') \in \mathcal{B}$
and $L' > h_{n,m}(R')$. Among all such pairs $(L, R)$, we select the pair for which $L$ is
minimum. By the result that we just established, we obtain a pair $(L', R')$ such that
$(L', R') \in \mathcal{B}$, $L' > h_{n,m}(R')$ and $L' < L$. This contradicts the minimality of $L$.


Lemma 5.4.3
$t_{opt} \le h_{n,m}(p)$.


Proof. By assumption, $(t_{opt}, p) \in \mathcal{B}$. Hence this result is a direct consequence of
Lemma 5.4.2.

This lemma along with Lemma 5.3.2 proves Theorem 5.3.1.


In the process of proving Lemma 5.3.2 we proved that $S_{trivial}$ is an optimum schedule.
On the other hand, the interpretation of the problem as a graph problem also
demonstrates that the adversary has a polynomial time algorithm for finding an optimum
killing sequence for each schedule $S$. When provided with $S$, the adversary
needs only to compute a polynomial number (actually fewer than $p$) of maximum bipartite
matchings, for which well known polynomial algorithms exist (for the fastest
known, see [31]).
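The matching criterion of Section 5.4 translates directly into a procedure for the adversary. The sketch below is a minimal illustration, assuming a schedule is given as a list of sets of processor labels; the names `max_matching` and `first_kill_time` are hypothetical, and a simple augmenting-path matching is used in place of the fastest known algorithm referred to above.

```python
def max_matching(adjacency):
    """Size of a maximum matching; adjacency maps each time vertex to a set of processors."""
    matched = {}  # processor -> time vertex currently matched to it

    def augment(time, visited):
        for proc in adjacency[time]:
            if proc in visited:
                continue
            visited.add(proc)
            if proc not in matched or augment(matched[proc], visited):
                matched[proc] = time
                return True
        return False

    return sum(1 for time in adjacency if augment(time, set()))


def first_kill_time(schedule, m):
    """Earliest time t (1-indexed) with nu(I_t(G)) >= m, or None if there is none."""
    for t, current in enumerate(schedule, start=1):
        # I_t(G): earlier times on the left, the processors used at time t on the right.
        earlier = {u: schedule[u] & current for u in range(t - 1)}
        if max_matching(earlier) >= m:
            return t
    return None


same_batch = [frozenset({0, 1, 2, 3})] * 4
batches = [frozenset({0, 1, 2, 3})] * 3 + [frozenset({4, 5, 6, 0})] * 2
print(first_kill_time(same_batch, 3), first_kill_time(batches, 3))
```

With $n = 4$, $m = 3$ and $p = 7$, the second schedule has length $h_{4,3}(7) = 5$ and is never killed, whereas reusing the same four processors a fourth time can be killed at time 4.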


5.5 Extensions 


The problem solved in this chapter is a first step towards modeling complex resilient 
systems and there are many interesting extensions. We mention only a few. 


An interesting extension is to consider the case of a system built up of processors
of different types. For instance consider the case of a system built up of a total of
$n$ processors, that is reconfigured at each time period and that needs at least $g_1$
non-faulty processors of type 1 and at least $g_2$ non-faulty processors of type 2 in
order to function satisfactorily. Assume also that these processors are drawn from
a pool $N_1$ of $p_1$ processors of type 1 and a pool $N_2$ of $p_2$ processors of type 2, that
$N_1 \cap N_2 = \emptyset$, and that there are no repairs. It is easy to see that the optimum
survival time $t_{opt}$ is at least the survival time of every strategy for which the number
of processors of type 1 and type 2 is kept constant throughout. Hence:

$$t_{opt} \ge \max_{\{(n_1, n_2);\ n_1 + n_2 = n\}} \min\left( h_{n_1, n_1 - g_1}(p_1),\ h_{n_2, n_2 - g_2}(p_2) \right).$$

It would be an interesting question whether $t_{opt}$ is exactly equal to this value or very
close to it.
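The lower bound for two processor types is a finite maximization and can be evaluated directly. The sketch below is an illustration only: the function names are hypothetical, and the feasibility conditions $g_i \le n_i \le p_i$ imposed on each split are assumptions added here so that every term is well defined.

```python
def h(n, m, k):
    """h_{n,m}(k) for integer k, as defined in Section 5.3."""
    full, rem = divmod(k, n)
    return full * m + max(rem + m - n, 0)


def two_type_lower_bound(n, g1, g2, p1, p2):
    """max over splits n1 + n2 = n of min(h_{n1,n1-g1}(p1), h_{n2,n2-g2}(p2))."""
    best = 0
    for n1 in range(1, n):
        n2 = n - n1
        # Assumed feasibility: enough slots for the required non-faulty processors,
        # and no more slots of a type than processors in the corresponding pool.
        if n1 < g1 or n2 < g2 or n1 > p1 or n2 > p2:
            continue
        best = max(best, min(h(n1, n1 - g1, p1), h(n2, n2 - g2, p2)))
    return best


print(two_type_lower_bound(5, 1, 1, 6, 4))
```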


Extend the definition of a scheduler to represent a randomized scheduling protocol.
(Phrased in this context, the result presented in this chapter is only about deterministic
scheduling protocols.) A scheduler is called adversary-oblivious if it decides
the schedule independently of the choices $f_1, f_2, \ldots$ made by the adversary. An off-line
adversary is an adversary that has access to the knowledge of the full schedule
$s_1, s_2, \ldots$ before deciding the full sequence $f_1, f_2, \ldots$ Note that, by definition, off-line
adversaries make sense only with adversary-oblivious schedulers. By comparison, an
on-line adversary decides for each time $t$ which processor $f_t$ to kill, without knowing
the future schedule: at each time $t$ the adversary decides $f_t$ based on the sole knowledge
of $s_1, \ldots, s_t$ and of $f_1, \ldots, f_{t-1}$. In this more general framework, the quantity
we want to determine is

$$t_{opt} = \max_S \min_A E[T(S, A)]. \qquad (5.3)$$


For an adversary-oblivious, randomized scheduler, one can consider two cases based 
on whether the adversary is on-line or off-line. As is easily seen, if the adversary is 
off-line, randomness does not help in the design of optimal schedulers: introducing 
randomness in the schedules cannot increase the survival time if the adversary gets 
full knowledge of the schedule before committing to any of its choices. As a result, 
the off-line version corresponds to the situation investigated in this chapter. 


It is of interest to study the online version of Problem 5.3. On-line adversaries model 
somewhat more accurately practical situations: faults naturally occur in an on-line 
fashion and the role of the program designer is then to design a scheduler whose ex- 
pected performance is optimum. We study this question for m = 1 in Chapter 7 and 
provide in this case a characterization of the set of optimal randomized scheduling 
policies. 
Chapter 6


Establishing the Optimality of a 
Randomized Algorithm 


Proving the precise optimality of a randomized algorithm solving a given problem 
P is always a very difficult and technical enterprise and only very few such proofs 
exist (see [25, 61]). 


A first difficulty is to define an adequate probabilistic model for the analysis of 
the randomized algorithms solving P. This model must take into account that, in 
general, some choices are not in the control of the algorithm considered but, instead, 
controlled by the adversary. It must also reckon with the fact that each randomized 
algorithm uses different random coins and hence carries a different probabilistic 
structure; nevertheless a common probabilistic structure has to be defined allowing 
for the comparison of all the algorithms solving P. The few papers published so
far and dealing with lower bounds [25, 35, 33, 61] rarely address this issue. ([25] 
introduces an ad-hoc model for the proof presented there.) The model presented in 
Chapter 2 is, to our knowledge, the first to allow formal proofs of lower-bounds for 
general randomized algorithms. 


A second difficulty is that, for a given problem P, the set of randomized algorithms
is infinite in general and hence looking for an optimal randomized algorithm involves 
doing a maximization over an infinite set. 


We let $f(\pi, A)$ denote the performance of a given randomized algorithm $\pi$ when used
in conjunction with an adversary $A$. Examples of typical performances $f(\pi, A)$ are
the expected running time or the probability of “good termination” when the algorithm
$\pi$ is used in conjunction with the adversary $A$. By changing, if
necessary, $f(\pi, A)$ into $-f(\pi, A)$ we can always assume that the algorithms $\pi$ are
chosen so as to maximize the value $f(\pi, A)$. The worst case performance of an algorithm
$\pi$ is given by $\inf_A f(\pi, A)$, and therefore the optimal worst case performance
is given by $\sup_\pi \inf_A f(\pi, A)$.


As discussed in Chapter 2, the problem of analyzing an algorithm, and of proving its
optimality, is best described in the language of game theory. (See also Section 8.3
for a presentation of the main notions of Game Theory.) We let Player(1) be the
entity selecting the algorithm (in short, the algorithm designer) and Player(2) be
the entity selecting the adversary (in short, the adversary designer). If Player(1)
selects the algorithm $\pi$ and if Player(2) selects the adversary $A$, the game played
consists of the alternating actions of the algorithm and the adversary: Player(1)
takes all the actions as described by $\pi$ until the first point where some choice has to
be resolved by the adversary; Player(2) then takes actions to resolve this choice as
described by $A$ and Player(1) resumes action once the choice has been resolved...


Note that, by definition, an algorithm $\pi$ is defined independently of a given adversary
$A$. On the other hand, an adversary might seem to be defined only in terms of a given
algorithm: the adversary is by definition the entity that resolves all choices not in
the control of the algorithm considered. If the model allowed for such an asymmetry
between the notions of algorithm and adversary we could not speak of an adversary
independently of the algorithm it would be associated to. For reasons that will soon
be explained, it is critical for our method to model adversaries independently of any
specific algorithm. In this case the algorithm designer and the adversary designer
are two well defined players playing a zero sum non-cooperative game. The sets of
strategies $\Pi$ and $\mathcal{A}$ are respectively the set of algorithms $\pi$ and the set of adversaries $A$. The
rules governing the interaction of the two players during an execution of the game
are set by the description of the problem P.


A very delicate matter is the nature of the information about the system held by 
either player when taking a step, and the formal way this information is taken into 
account in the model. Generally, some information about the moves of either player 
is conveyed onto the other player during the execution (i.e., during the game). A 
player having a more precise knowledge of the state of the system is better able to
act optimally toward its goal (maximizing or minimizing the performance function
$f(\pi, A)$). The proof of optimality of a player is therefore tantamount to proving
that, at each round, the player uses optimally the information available in order to 
take its next move. This is in general a very difficult task for which no clear general 
approach seems to exist. Nevertheless, using the concept of saddle point in game 
theory allows us to derive a general proof strategy for proving the optimality of an 
algorithm. We now present and discuss this methodology. 
If adversaries are each defined independently of any specific algorithm $\pi$, for every
adversary $A$ we can consider the family $(f(\pi, A))_\pi$ obtained by letting $\pi$ range over
the whole set of algorithms. Therefore, in this case, for every adversary $A$, the
quantity $\sup_\pi f(\pi, A)$ is well-defined.


By Lemma 8.2.2, for every algorithm $\pi_0$ and every adversary $A_0$ we have $\inf_A f(\pi_0, A)
\le \sup_\pi f(\pi, A_0)$. Furthermore, this inequality is an equality only if $\pi_0$ is an optimal
algorithm. This simple fact provides us with the following general proof methodology
to attempt to prove that a given algorithm $\pi_0$ is optimal.


1. Construct a $\Pi/\mathcal{A}$-structure modeling the interaction between Player(1)
and Player(2). (This means in particular that an adversary is defined
independently of the choice of any specific algorithm.)


2. Provide an adversary $A_0$ such that $\inf_A f(\pi_0, A) = \sup_\pi f(\pi, A_0)$.


By Proposition 8.2.3, a pair $(\pi_0, A_0)$ realizing the equality $\inf_A f(\pi_0, A) = \sup_\pi f(\pi, A_0)$
exists if and only if $\max_\pi \inf_A f(\pi, A) = \min_A \sup_\pi f(\pi, A)$.


(This equality succinctly expresses the three following facts. 1) $\sup_\pi \inf_A f(\pi, A) =
\inf_A \sup_\pi f(\pi, A)$. 2) A protocol $\pi$ achieves the sup in $\sup_\pi \inf_A f(\pi, A)$, i.e., $\sup_\pi \inf_A
f(\pi, A) = \max_\pi \inf_A f(\pi, A)$. And 3) an adversary $A$ similarly achieves the inf in
$\inf_A \sup_\pi f(\pi, A)$, i.e., $\inf_A \sup_\pi f(\pi, A) = \min_A \sup_\pi f(\pi, A)$.)
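When both players have finitely many strategies, the second step of the methodology amounts to locating a saddle point in a table of performances. The sketch below is an illustration under assumptions made here: the tables are hypothetical, rows stand for algorithms (the maximizing player) and columns for adversaries, and the function names are not taken from the text.

```python
def worst_case_per_algorithm(f):
    """inf over adversaries of f(pi, A), one value per row pi."""
    return [min(row) for row in f]


def best_case_per_adversary(f):
    """sup over algorithms of f(pi, A), one value per column A."""
    return [max(row[j] for row in f) for j in range(len(f[0]))]


def saddle_points(f):
    """Pairs (pi0, A0) with inf_A f(pi0, A) == sup_pi f(pi, A0)."""
    row_min = worst_case_per_algorithm(f)
    col_max = best_case_per_adversary(f)
    return [
        (i, j)
        for i in range(len(f))
        for j in range(len(f[0]))
        if row_min[i] == col_max[j]
    ]


table = [[3, 1, 4], [2, 2, 3], [0, 1, 5]]
print(saddle_points(table))                       # a pair certifying optimality of row 1
print(saddle_points([[1, -1], [-1, 1]]))          # no pair exists for this table
```

In the first table the pair found certifies that the corresponding row is optimal, with max-min and min-max both equal to 2; in the second table no such pair exists among the listed strategies, so the methodology cannot be applied to it as it stands.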


To find an algorithm and prove its optimality using the previous methodology, we
are therefore led to model algorithms and adversaries in such a way that the equality
$\sup_\pi \inf_A f(\pi, A) = \inf_A \sup_\pi f(\pi, A)$ holds. There are two cases where this
happens:


Von Neumann: We assume that the set $\Pi$ of strategies of Player(1) is the set
of probability distributions on a given finite set $I$ and that, similarly, the set
$\mathcal{A}$ of strategies of Player(2) is the set of probability distributions on a given
finite set $J$. When saying this, we actually abuse language and identify a
probability distribution with the procedure consisting in drawing an element
at random according to this probability distribution. Hence, by convention,
for every $\pi \in \Pi$, the strategy $\pi$ consists in drawing an element $i$ of $I$ at
random and according to the distribution $\pi$. Similarly, for every $A \in \mathcal{A}$, the
strategy $A$ consists in drawing an element $j$ of $J$ at random and according
to the distribution $A$. For every $(i, j) \in I \times J$ and all strategies $\pi$ and $A$
resulting in the selection of $i$ and $j$, a predetermined cost $T(i, j)$ is incurred.
The performance $f(\pi, A)$ is by assumption the expected cost $E_{\pi,A}[T]$ obtained
under the strategies $\pi$ and $A$.


The game just described can be encoded as a matrix game: $I$ and $J$ are the
sets of pure strategies whereas $\Pi$ and $\mathcal{A}$ are the sets of mixed strategies of the
game. By Von Neumann’s theorem (see Theorem 8.3.2), $\max_\pi \min_A f(\pi, A) =
\min_A \max_\pi f(\pi, A)$. Recall once more that the finiteness of both $I$ and $J$ is
critical for this result.


Strong Byzantine: Assume that the rules of the game played between Player(1)
and Player(2) specify that, in every execution, Player(2) first learns explicitly
the strategy $\pi$ chosen by Player(1) before having to commit itself to any action.
(We could picture this by saying that, by convention, an execution begins with
Player(1) “sending a message” to Player(2) disclosing the strategy $\pi$ under
use.) Hence, in this situation, a strategy $A$ for Player(2) is actually a family
$A = (A_\pi)_\pi$ of strategies $A_\pi$, one for every strategy $\pi$ of Player(1). We say
that $A_\pi$ is an adversary specially designed for $\pi$. Assume furthermore that the
performance function is such that, for all adversaries $A$ and $A'$ and for every
algorithm $\pi$, if $A_\pi = A'_\pi$ then $f(\pi, A) = f(\pi, A')$. This last property allows
us to extend the definition of $f$: for every algorithm $\pi$ and every strategy $a$
specially designed for $\pi$, we set $f(\pi, a) = f(\pi, A)$ where $A$ is any adversary
such that $A_\pi = a$. Assume also that $\mathcal{A}$ is stable under reshuffling in the
following sense. Let $(a(\pi))_\pi$ be a family of specially designed adversaries, one
for every $\pi \in \Pi$. (Hence, by definition, for every $\pi$ and $\pi'$ in $\Pi$, there exist
an adversary $A$ and an adversary $A'$ such that $A_\pi = a(\pi)$ and $A'_{\pi'} = a(\pi')$.
The adversaries $A$ and $A'$ are a priori different.) Then $(a(\pi))_\pi$ is itself an
admissible adversary, i.e., an element of $\mathcal{A}$.


The definition $A = (A_\pi)_\pi$ immediately shows that an adversary $A$ does not
depend on the choice of a specific algorithm. Hence the Strong Byzantine
setting satisfies point 1 of our general methodology.


We now show that, in this setting, $\sup_\pi \inf_A f(\pi, A) = \inf_A \sup_\pi f(\pi, A)$ and
that, therefore, an algorithm $\pi_0$ is optimal if and only if there exists an adversary
$A_0$ such that $\inf_A f(\pi_0, A) = \sup_\pi f(\pi, A_0)$. This will show that the
Strong Byzantine setting is well suited for an implementation of our general
methodology.
For every $\epsilon > 0$ and every $\pi$ in $\Pi$ (the set of strategies of Player(1)), let
$A_\pi(\epsilon)$ be an adversary specially designed for $\pi$ and such that

$$f(\pi, A_\pi(\epsilon)) < \inf_A f(\pi, A) + \epsilon.$$

The set $\mathcal{A}$ of adversaries being stable under reshuffling, we can define an adversary
$A(\epsilon)$ by $A(\epsilon) = (A_\pi(\epsilon))_\pi$. We have:

$$\inf_A \sup_\pi f(\pi, A) \le \sup_\pi f(\pi, A(\epsilon)) = \sup_\pi f(\pi, A_\pi(\epsilon)) \le \sup_\pi \inf_A f(\pi, A) + \epsilon.$$

The parameter $\epsilon$ being arbitrary, this shows that $\inf_A \sup_\pi f(\pi, A) \le \sup_\pi \inf_A
f(\pi, A)$. By Lemma 8.2.1 the converse inequality $\sup_\pi \inf_A f(\pi, A) \le \inf_A \sup_\pi
f(\pi, A)$ holds trivially. Hence $\sup_\pi \inf_A f(\pi, A) = \inf_A \sup_\pi f(\pi, A)$ which
concludes the proof.


We present here an intuitive interpretation of this result. 


Recall first that, as discussed on page 207, in the expression $\sup_\pi \inf_A f(\pi, A)$,
Player(2) can be assumed to learn implicitly the strategy $\pi$ chosen by Player(1).
Symmetrically, in the expression $\inf_A \sup_\pi f(\pi, A)$, Player(1) learns implicitly
the strategy $A$ chosen by Player(2). Furthermore, as discussed after Equation 8.4,
page 206, the strict inequality $\sup_\pi \inf_A f(\pi, A) < \inf_A \sup_\pi f(\pi, A)$
means precisely that the outcome of the game is different according to which
of the two players can thus learn its opponent’s strategy.


If, by construction, Player(2) is informed explicitly of the strategy used by
Player(1), its knowledge is evidently unaffected by whether it furthermore
learns this fact implicitly (as in the expression $\sup_\pi \inf_A f(\pi, A)$) or not (as
in the expression $\inf_A \sup_\pi f(\pi, A)$). Let $A_0$ be the strategy for Player(2)
informally described by “Wait for the disclosure of the strategy $\pi$ selected
by Player(1). Then select an optimal strategy to be adopted for the rest of
the game.” It is clear that $A_0$ is an optimal strategy for Player(2). Assume
that Player(2) plays optimally and adopts this strategy and consider the case
where Player(1) learns implicitly that Player(2) uses strategy $A_0$. We easily
see that this knowledge does not confer any advantage to Player(1): Player(1)
can only derive from it that, for every strategy $\pi$ it elects, Player(2) chooses
a corresponding optimal strategy.

This establishes that, when Player(2) is informed explicitly of the strategy
used by Player(1), Player(2) gains no additional advantage in learning implicitly
the strategy $\pi$ used by Player(1); and that, in this case, Player(1)
gains similarly no advantage in being guaranteed that Player(2) uses its (optimal)
strategy $A_0$. This shows that, when Player(2) is informed explicitly
of the strategy used by Player(1), the outcome of the game is not affected
when one or the other player learns implicitly its opponent’s strategy. As argued
at the beginning of this discussion, this means that $\sup_\pi \inf_A f(\pi, A) =
\inf_A \sup_\pi f(\pi, A)$ for every Strong Byzantine game.
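The two settings can be illustrated on toy games. The sketch below is an illustration and not a general solver, and every table in it is hypothetical: the Von Neumann part uses the matching-pennies cost matrix and searches a coarse grid of mixed strategies (an assumption made here to keep the example short), and the Strong Byzantine part builds every reshuffled family $(a(\pi))_\pi$ from a small table of specially designed strategies.

```python
from fractions import Fraction
from itertools import product

# Von Neumann setting: cost T[i][j] for pure strategies i in I and j in J.
T = [[1, -1], [-1, 1]]


def expected_cost(pi, adv):
    """E_{pi,A}[T] for distributions pi on I and adv on J."""
    return sum(pi[i] * adv[j] * T[i][j] for i in range(2) for j in range(2))


def max_min(strategies):
    return max(min(expected_cost(p, a) for a in strategies) for p in strategies)


def min_max(strategies):
    return min(max(expected_cost(p, a) for p in strategies) for a in strategies)


pure = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]
mixed = [(Fraction(k, 20), 1 - Fraction(k, 20)) for k in range(21)]  # grid of distributions
print(max_min(pure), min_max(pure))     # the two values differ on pure strategies
print(max_min(mixed), min_max(mixed))   # the two values coincide on the grid

# Strong Byzantine setting: f_special[pi][a] is the performance of algorithm pi
# against the strategy a specially designed for pi.
f_special = {"pi1": {"a": 3, "b": 1}, "pi2": {"a": 2, "b": 4}}
algorithms = list(f_special)
# Stability under reshuffling: every family (a(pi))_pi is an admissible adversary.
adversaries = [
    dict(zip(algorithms, choice))
    for choice in product(*(f_special[pi] for pi in algorithms))
]


def f(pi, adversary):
    return f_special[pi][adversary[pi]]


sup_inf = max(min(f(pi, A) for A in adversaries) for pi in algorithms)
inf_sup = min(max(f(pi, A) for pi in algorithms) for A in adversaries)
print(sup_inf, inf_sup)
```

In this toy table, an adversary restricted to playing the same strategy against both algorithms would give a max-min of 2 against a min-max of 3; it is the reshuffled family choosing `b` against `pi1` and `a` against `pi2` that closes the gap.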


As a short aside and to illustrate the generality of our proof methodology we show
that the complicated proof given in [25] falls in the framework of the Strong
Byzantine case of the methodology. (In [25], Graham and Yao consider the Byzantine
Generals problem with 3 processes, one of which is faulty.)


By assumption Player(2) knows the algorithm $\pi$ selected by Player(1). (This point
is never stated explicitly in [25]: the authors of [25] just mention that they “have
incorporated the capability for faulty processes to collude, to spy on all communication
lines and to wait for messages transmitted by non-faulty processes in the
current round to arrive before making decisions on their own messages.” Nevertheless
the strategies $\sigma_A$, $\sigma_B$ and $\sigma_C$ of Player(2) are described in terms of $\pi$.) Hence,
as discussed on page 128, a strategy $A$ of Player(2) is actually a family $(A_\pi)_\pi$ and
does not depend on the choice of a specific $\pi$.


The performance function is defined to be

$$f(\pi, A) = P_{\pi, A_\pi}[\text{good termination}],$$

where $P_{\pi, A_\pi}$ is the probability on the set of executions induced by the algorithm
$\pi$ and the adversary $A_\pi$ specially designed for $\pi$ and associated to $A$. The event
good termination is a special event defined in terms of the game studied in [25].
The definition of $f(\pi, A)$ shows immediately that $f(\pi, A) = f(\pi, A')$ if $A_\pi = A'_\pi$.
Furthermore, the set $\mathcal{A}$ of adversaries considered in [25] is by assumption stable
under reshuffling. We are therefore in the Strong Byzantine setting of the general
methodology. We now summarize the proof presented in [25] and show that it follows
precisely our general methodology.

The proof of [25] is organized as follows. A specific algorithm $\pi_0$ is first given
for which the quantity $\mathrm{performance}(\pi_0) = \inf_A f(\pi_0, A)$ is easily derived.¹ A specific
(but very complex) strategy $A_0$ for Player(2) is then described. In order to
implement strategy $A_0$,² Player(2) uses critically its knowledge of the strategy $\pi$
used by Player(1): at every point of the game (i.e., of the execution) Player(2)
selects its next move by emulating $\pi$ under certain conditions. Working by induction
on the number of rounds of the algorithm $\pi$ selected by Player(1), [25]
then shows that, for every $\pi$, $f(\pi, A_0) \le \mathrm{performance}(\pi_0)$. This implies that
$\sup_\pi f(\pi, A_0) \le \mathrm{performance}(\pi_0) = \inf_A f(\pi_0, A)$. By Lemma 8.2.2, the converse
inequality $\sup_\pi f(\pi, A_0) \ge \inf_A f(\pi_0, A)$ is trivially true. Hence

$$\sup_\pi f(\pi, A_0) = \inf_A f(\pi_0, A),$$

which establishes the second point of our general proof methodology and therefore
proves that $\pi_0$ is optimal.

¹ This algorithm is actually called $A_0$ in [25]. We use $\pi_0$ to be consistent with the rest of our
discussion.


A natural question is whether the two previous settings, although different in form,
are truly different. In slightly more precise terms, the question is whether the
existence of a proof of optimality of a given algorithm $\pi_0$ in one of the two settings
implies the existence of a proof of optimality of $\pi_0$ in the other setting.


The following argument tends to suggest a similarity between the two settings. (At
least when the performance function $f(\pi, A)$ is equal to the expected value $E_{\pi,A}[T]$ of
a random variable $T$: recall that the Von Neumann setting requires this condition.)


Let $(G, \Pi, \mathcal{A})$ be a game³ between Player(1) and Player(2) with a performance
function $f$. Consider all the possible modifications of this game obtained by providing
Player(2) during the execution of the game with some information about the
strategy $\pi$ followed by Player(1). All these different games yield the same value
$\sup_\pi \inf_A f(\pi, A)$, because, as discussed on page 207, Player(2) can be assumed
to learn implicitly the complete strategy $\pi$ in the expression $\sup_\pi \inf_A f(\pi, A)$: receiving
some complementary explicit information about $\pi$ does not then raise its
knowledge. This shows that there is a whole spectrum of models for the adversary
and that to all of them is attached the same class of optimal algorithms. The
two settings presented above, the “Von Neumann setting” and the “Strong Byzantine
setting”, correspond to two extreme situations: one where Player(2) receives only a
bounded number of bits of information about $\pi$ in the course of an execution, and
one where it receives the complete description of $\pi$ at the very beginning of the game.
The argument above seems to suggest that the two settings are equally good to
establish the optimality of a randomized algorithm.


² More precisely, in order to implement $A_{0,\pi}$, the adversary associated to $A_0$ and specially
designed for $\pi$.
³ See page 205 for a discussion on game theory.
We now discuss two examples, the algorithm of [25] (again), and the scheduling
algorithm presented in Chapter 7. These examples reveal that the choice of the 
setting actually influences greatly the proof of optimality of a randomized algorithm. 


Consider first the scheduling problem considered in Chapter 7. The performance
function $f(\pi, A)$ considered is the expected value $E_{\pi,A}[T]$ of a random variable $T$
called the survival-time. In this game, Player(2) does not know a priori the strategy
selected by Player(1). This is formally seen in the model presented in Section 7.2:
at each time $t$, the view of Player(2) contains the schedule $s_1, \ldots, s_t$ previously
selected by Player(1) but contains no additional information about the algorithm
$\pi$ that generated that schedule. We prove that all the algorithms in the set $\mathit{Prog}_0$
defined on page 151 are optimal. Our discussion above therefore shows that these
algorithms would similarly be optimal if Player(2) was endowed with the spying
capability and learned the strategy selected by Player(1) at the beginning of the
execution. Nevertheless, the proof that we present uses critically that Player(2)
does not have this capability: if Player(2) was modeled as knowing the algorithm
$\pi$, our Lemma 7.3.2 would not be true and all the results of Section 7.6 would not
hold anymore.


We consider now the Byzantine Generals problem of [25] and show that, in contrast
to the previous example, both the Strong Byzantine setting and the Von Neumann
setting can be used to formalize the proof given by Graham and Yao.


The performance function $f(\pi, A)$ considered in [25] is the probability of termination
with agreement on a correct value when Player(1) selects the algorithm $\pi$
and Player(2) plays according to the strategy $A$. A probability being a special
case of an expected value, the performance function $f(\pi, A)$ is the expected value
$E_{\pi,A}[T]$ of some variable schema⁴ $T$.


We argued on page 130 that Graham and Yao use the Strong Byzantine setting in
their proof: their Player(2) uses as a black box the algorithm $\pi$ chosen by Player(1)
in order to generate its own outputs in the course of the execution. Nevertheless an
even more careful reading of their proof reveals that Player(2) does not need the
full knowledge of $\pi$ but just needs to have access to finitely many values produced
by $\pi$. Hence we could consider a setting where, by convention, Player(1) would
provide Player(2) with those values: in that case Player(2) would need no additional
knowledge about $\pi$. As argued above on page 131, in this modified game (where
Player(1) gives some partial information about its strategy) the algorithm $\pi_0$ of [25]
is still optimal. But we are now in the Von Neumann setting (when analyzing
algorithms terminating in finitely many rounds).


⁴ The definition of a variable schema is given in Definition 2.4.1.
We have thus argued that, by reducing the information transmitted to Player(2)
from the complete description of $\pi$ to only finitely many bits of information we
could adapt the proof of [25] given in the Byzantine setting into one given in the
Von Neumann setting. We could go further and consider the case where Player(1)
does not cooperate with Player(2) and provides Player(2) with no information about
its strategy (except for what can be “naturally” deduced from an execution). The
discussion given above on page 130 shows as before that the algorithm $\pi_0$ of [25] is
still optimal. Nevertheless, in this case, the proof of [25] does not apply and it is
not clear at all how a direct proof would then proceed.


These two examples show that the choice of setting is far from innocent and influ- 
ences greatly the proof of optimality of a randomized algorithm. We present in the 
next theorem a result establishing formally that the two settings are in some cases 
incompatible. 


Theorem 6.0.1 Let $P$ be a problem, let $\Pi$ be the set of randomized algorithms
solving $P$ and $A$ be the set of adversaries. Assume that $\Pi$ contains more than
one element. Assume also that the interaction between the two players is modeled so as
to allow Player(2) to know the algorithm $\pi$ in use. Then the Von Neumann setting cannot
be used to model the interaction between Player(1) and Player(2).


Proof. Note first that the Von Neumann setting applies only if the sets $\Pi$ and
$A$ of strategies contain all the convex combinations of their elements: for all
strategies $\pi_1$ and $\pi_2$ in $\Pi$ and all non-negative numbers $\alpha_1$ and $\alpha_2$ summing to
one, $\alpha_1\pi_1 + \alpha_2\pi_2$ is also in $\Pi$. (Recall that, by definition, in the Von Neumann setting,
the strategies $\pi_1$ and $\pi_2$ are probability distributions, so that the linear combinations
$\alpha_1\pi_1 + \alpha_2\pi_2$ are well defined.) Hence, in the case where the Von Neumann setting
applies, the set $\Pi$ is either a singleton or an infinite set. (The case where $\Pi$ is
a singleton is a degenerate case where Player(1) has only one strategy, which is
trivially optimal.†)


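Also, in the case where the Von Neumann setting applies, a single probability space
$(\Omega,\mathcal{G})$ can be used to analyze the probabilistic behavior of all the pairs $(\pi,\mathcal{A})$ of
algorithm and adversary. This probability space can be chosen to be the product
space $\Omega = I \times J$ endowed with its complete σ-field $\mathcal{G} = 2^{\Omega}$, where $I$ and $J$ denote
the finite sets on which the strategies of Player(1) and Player(2) are probability
distributions: $\Omega$ and $\mathcal{G}$ are both finite.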


In the general case, we saw in Chapter 2 that the construction of an adequate
probabilistic structure is more complicated and yields a possibly different space
$(\Omega_{\pi,\mathcal{A}},\mathcal{G}_{\pi,\mathcal{A}})$ for every pair $(\pi,\mathcal{A})$. Consider the case where, as when the Von Neumann
setting applies, a single space $(\Omega,\mathcal{G})$ is used for all the pairs $(\pi,\mathcal{A})$. The sample
space $\Omega$ must contain a different element $\omega$ for each possible execution, i.e., for each
sequence of actions $(act_1, act_2, act_3, \ldots)$ arising from the game played by Player(1)
and Player(2).


†Remember that all this discussion is geared toward finding an optimal algorithm!


Assume that the interaction between Player(1) and Player(2) is modeled so as to allow
Player(2) to know the algorithm $\pi$ in use. Then, in every execution,
there must be a move (or a sequence of moves), specific to $\pi$, undertaken by
Player(1) and informing Player(2) of the strategy $\pi$ chosen.


Working by contradiction, assume that we could use the Von Neumann setting to
model the game between Player(1) and Player(2). This means that the set $\Pi$ can
be represented as a set of probability distributions on a given finite set $I$, and that,
similarly, the set $A$ can be represented as a set of probability distributions on a
given finite set $J$ (and that the performance $f(\pi,\mathcal{A})$ is the expected value $E_{\pi,\mathcal{A}}[T]$
of a random variable $T$). In that case the sample space $\Omega$ is equal to $I \times J$ and is
therefore finite.


On the other hand, as discussed above, in that case the set $\Pi$ must be infinite: by
assumption it contains more than one element, and hence is not a singleton.
As in each execution Player(1) informs Player(2) of its strategy $\pi$, the set of
different executions must therefore also be infinite. This implies that $\Omega$ is infinite, a
contradiction.


Chapter 7 


An Optimal Randomized Algorithm


In this chapter we consider the scheduling problem studied in Chapter 5 but
allow algorithms to use randomness. The terms protocol and algorithm are
synonymous, but for notational emphasis we favor here the use of protocol: the notations
$\Pi$, $\pi$, $P_\pi$, Pgenerating will refer to protocols whereas the notations $A$, $\mathcal{A}$, $P_{\mathcal{A}}$,
Agenerating will refer to adversaries.


Using the general model presented in Chapter 2 we construct a $\Pi/A$-structure
associated to the scheduling problem. This allows us to characterize very precisely
the optimization problem. We provide a specific randomized protocol and give a
completely formal proof of its optimality.


This proof is to our knowledge the first completely formal proof of optimality of
a randomized algorithm.¹ This chapter should therefore illuminate the power and
relevance of the model presented in Chapter 2.


7.1 The Scheduling Problem 


7.1.1 Description of the problem 


We recall quickly here the setting of the problem: $m$, $n$ and $p$ are three non-negative
integers such that $1 \le m < n < p$.


¹The proof given by Graham and Yao in [25] still needs some fine tuning...

• $p$ is the total number of processors available.


• $n$ is the number of processors that are necessary to configure the system: at
each time $n$ processors are in operation.


• We assume that a processor can become faulty only when in operation and
that faults occur one at a time. We also assume that no repairs are available.
$m$ is the resiliency parameter of the system: the system functions as long as
the set of $n$ processors selected does not include more than $m$ faulty processors.
The system crashes as soon as the set of $n$ processors currently in use includes
at least $m+1$ faulty processors.


We define the discrete time t of an execution to be the execution point at which the 
t-th fault occurs. In the sequel, we write time instead of discrete time. 


We consider the blind situation where, during an execution, the scheduler is informed
of the occurrence of a fault whenever one occurs, but does not get any additional
information about the location of this fault. Upon notification of a fault the scheduler
reconfigures the system. We let $s_1$ denote the set of $n$ elements selected for
the first time and, for $t \ge 2$, we let $s_t$ denote the set of $n$ elements selected after
report of fault $t-1$ (i.e., after time $t-1$). We also let $f_1$ ($f_1 \in s_1$) denote the
location of the first fault and, generally, we let $f_t$ ($f_t \in s_t$) denote the location
of the $t$-th fault. For the sake of modeling we say that the sequence $f_1, f_2, \ldots$ is
decided by an entity called the adversary. The purpose of this work is to find a
scheduling protocol guaranteeing the best expected survival time against worst-case,
on-line adversaries. This means that, when selecting the location $f_t$ of the $t$-th fault,
the adversary “knows” the whole past $s_1, f_1, s_2, f_2, \ldots, s_t$. We can equivalently say
that, for each $t$, the adversary “receives the information” of what the choice $s_t$ of the
scheduling protocol is before deciding what the next fault is. Note that, by contrast,
expressed in this language of on-line information, the assumption that the protocol
is blind means that, for each $t$, the scheduling protocol “receives no information”
about the choices previously made by the adversary before deciding itself what the
set $s_t$ is. We will provide in Section 7.2 a formal setting in which these
notions of knowledge can be interpreted.
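
The information pattern just described can be made concrete with a small sketch. It is only an illustration under our own conventions: the function names `play` and `lowest_working` are hypothetical, and the adversary shown is an arbitrary deterministic one, not an optimal one. The protocol is blind because the schedule is fixed without ever reading the faults; the adversary is on-line because it is called with the past $s_1, f_1, \ldots, s_t$ only.

```python
def play(schedule, adversary):
    """Run the rounds in order and return [(s_1, f_1), ..., (s_p, f_p)]."""
    view = ()                      # what the adversary has seen so far
    execution = []
    for s_t in schedule:           # blind protocol: the schedule never reads the faults
        view = view + (s_t,)       # on-line adversary: it sees s_t before choosing f_t
        f_t = adversary(view)
        if f_t not in s_t:
            raise ValueError("a fault must hit a processor in operation")
        execution.append((s_t, f_t))
        view = view + (f_t,)
    return execution


def lowest_working(view):
    """Hypothetical adversary: hit the smallest working processor of the current set."""
    faulty = set(view[1::2])       # odd positions of the view hold f_1, ..., f_{t-1}
    current = view[-1]             # the set s_t last selected by the protocol
    working = sorted(current - faulty)
    return working[0] if working else min(current)


# Example with p = 3 and n = 2.
schedule = [frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})]
faults = [f for _, f in play(schedule, lowest_working)]
```

In this run the adversary never sees $s_{t+1}$ when choosing $f_t$, which is exactly the on-line restriction formalized later in Section 7.2.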


7.1.2 Interpretation using game theory 


The purpose of this section is to present some intuition for the formal model
presented in Section 7.2. Some notions that we introduce in the course of the discussion,
such as the notion of internal and external actions, are not presented formally but
should be clear from the context.

Following the methodology outlined in Chapters 2 and 6 we describe the scheduling
problem presented in Section 7.1.1 as a game played between two players Player(1)
and Player(2). We will refer to this game as the “scheduling game”. In this setting,
a protocol is a strategy of Player(1) and an adversary is a strategy of Player(2):
Player(1) is called the protocol-designer and Player(2) is called the
adversary-designer. The game played by the two players follows the rules of the scheduling
problem described in Section 7.1.1: Player(1) plays first, chooses $s_1$ and informs
Player(2) of its decision. Player(2) then plays and selects $f_1$ in $s_1$. No
information is conveyed from Player(2) to Player(1). More generally, in the $t$-th round,
Player(1) selects a set $s_t$ and informs Player(2) of its choice; Player(2) then plays
and selects an element $f_t$ in $s_t$. We adopt the model where Player(2) does not know
explicitly the strategy $\pi$ followed by Player(1).² (In the model where Player(2)
knows explicitly the strategy $\pi$ followed by Player(1), Player(1) “sends a message”
informing Player(2) of the strategy selected by Player(1).)


As discussed in Chapter 6, page 134, for every protocol $\pi$ and for every adversary
$\mathcal{A}$ the sample space $\Omega$ must contain a different $\omega$ for each possible execution,
i.e., for each sequence of actions $(act_1, act_2, act_3, \ldots)$ undertaken in the game played
by Player(1) and Player(2) when following the rules of the scheduling game. Some
care has to be devoted to characterizing the actions that we consider here. A specific
protocol (or adversary) can be implemented in various ways, each of them having
specific internal actions. Nevertheless, internal actions are irrelevant for the
performance analysis of a protocol: the performance of a protocol is
measured solely in terms of its external actions, i.e., the specific actions it undertakes as
prescribed by the rules of the game. In a figurative sense, we treat a protocol (resp.
an adversary) as a black box and only analyze its external actions.


In our scheduling game and in the model where Player(2) does not know the strategy
$\pi$ followed by Player(1), the external actions undertaken by Player(1) are the
successive choices of a set $s_t$ and communications to Player(2) of the choice last made.
To simplify the discussion we will omit explicit reference to the communication
between Player(1) and Player(2) and implicitly assume that this communication is
systematically (and instantaneously) performed at each selection of a set $s_t$.
Similarly, the external actions undertaken by Player(2) are the successive choices of an
element $f_t$ in the set $s_t$ last selected by Player(1). To simplify the discussion
further we will abuse language and speak of the actions $s_t$ and $f_t$ in place of “choice
of $s_t$” and “choice of $f_t$”. The assumption $m < p$ clearly implies that the system
cannot survive more than $p$ faults. We can therefore restrict our analysis to the
times $t = 1,\ldots,p$. From this discussion we deduce that the sample space

$$\Omega \stackrel{\mathrm{def}}{=} \{(s_1, f_1, \ldots, s_p, f_p);\ s_t \in \mathcal{P}_n(p),\ f_t \in s_t,\ 1 \le t \le p\}$$

is big enough for the probabilistic analysis.


²See Section 8.3 and Chapter 6 for a presentation of the notions of explicit/implicit knowledge.


This definition is in conformity with the general construction given on page 41 of
Chapter 2. For every protocol $\pi$ and every adversary $\mathcal{A}$ we defined there

$$\Omega_{\pi,\mathcal{A}} = \{\omega;\ \omega \text{ is a } (\pi,\mathcal{A})\text{-execution}\},$$

where

$$\omega = \underbrace{a_1}_{\text{Player(2)}}\,(s_1,x_1,y_1)\ \underbrace{a_2}_{\text{Player(1)}}\,(s_2,x_2,y_2)\ \underbrace{a_3}_{\text{Player(2)}}\,(s_3,x_3,y_3)\ \cdots$$

The discussion given on page 42 shows that, for every $t$, $(s_t,x_t,y_t)$ is a deterministic
function of $a_1,\ldots,a_t$ so that, from a probabilistic point of view, an execution can
equivalently be defined to be the sequence $a_1, a_2, \ldots$ This is the definition adopted
in this chapter.


We have so far informally defined protocols and adversaries to be the strategies of 
Player(1) and Player(2), respectively. We now discuss how these notions can be 
formalized, beginning with the notion of adversary. Our construction is a direct 
application of the general construction given in Chapter 2. 


We define an adversary to be a family of probability distributions $(Q_v)_{v\in V}$, one
for each $v$ in $V$, where $V$ is the set of all the possible views that Player(2) can have of
the system at any time of the game, i.e., at any time of the execution. (We will
make this more explicit in Section 7.2.) For every element $f$, the quantity $Q_v(f)$
represents the probability that Player(2) chooses $f$ if its view of the system is $v$.
Note that, according to the general presentation made in Chapter 2, we should define
an adversary to be a family of probability spaces $(\Omega_v,\mathcal{G}_v,P_v)_{v\in V}$. Nevertheless, we
can take all the measurable spaces $(\Omega_v,\mathcal{G}_v)$ to be equal to $(\{1,\ldots,p\}, 2^{\{1,\ldots,p\}})$.
This allows us to omit mentioning $(\Omega_v,\mathcal{G}_v)$ in the definition of the adversary.


Note that a family $(f_v)_{v\in V}$, i.e., the choice of an element $f_v$ for each view $v$ in $V$,
corresponds to a decision tree of Player(2).³ As the number of rounds of a game is
bounded and as, at each round, the number of different actions of both players is
also bounded, the number of decision trees of Player(2) is similarly bounded. In this
case it is easy to see that the set of strategies of Player(2), i.e., the set of families
$(Q_v)_{v\in V}$, is in one-to-one correspondence with the set of probability distributions
on decision trees.


³Actually a decision tree corresponds to a “weeded out” family $(f_v)_{v\in V'}$, where $V'$ is a maximal
subset of $V$ having the property that each view $v$ in $V'$ is compatible with the choices $f_{v'}$ made
previously by the player. Nevertheless, the extension to the set of all views is inconsequential and
we adopt for simplification this characterization of a decision tree.


We could therefore equivalently define an adversary to be a probability distribution
on the set of decision trees (of Player(2)). We say that the definition in terms of a
family $(Q_v)_{v\in V}$ adopts the local point of view whereas the definition in terms of a
decision tree adopts the global point of view. Let us emphasize that the equivalence
of these two points of view depends on the finiteness of the number of decision trees.⁴
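
One direction of this correspondence can be sketched in a few lines, under assumptions of our own: views are plain labels, the family $(Q_v)_{v\in V}$ is a hypothetical two-view example, and a decision tree is encoded as one choice per view (the extended family of footnote 3). The function `global_from_local` draws independently at every view; `local_from_global` recovers the distribution used at a given view.

```python
from fractions import Fraction
from itertools import product


def global_from_local(local):
    """local: {view: {choice: probability}}  ->  {decision tree: probability}.

    A decision tree is encoded as a tuple of (view, choice) pairs, one per view.
    """
    views = sorted(local)
    trees = {}
    for choices in product(*(sorted(local[v]) for v in views)):
        weight = Fraction(1)
        for v, f in zip(views, choices):
            weight *= local[v][f]          # independent draw at every view
        if weight:
            trees[tuple(zip(views, choices))] = weight
    return trees


def local_from_global(trees, view):
    """Probability of each choice at `view` under a distribution on decision trees."""
    marginal = {}
    for tree, weight in trees.items():
        f = dict(tree)[view]
        marginal[f] = marginal.get(f, Fraction(0)) + weight
    return marginal


# Hypothetical two-view adversary, with choices named by processor number.
Q = {
    "v1": {1: Fraction(1, 3), 2: Fraction(2, 3)},
    "v2": {1: Fraction(1, 2), 3: Fraction(1, 2)},
}
trees = global_from_local(Q)
```

The product over views is finite here only because the number of views is finite, which is the finiteness caveat emphasized above.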


Following the same model for Player(1) we could define a protocol to be a family
$(P_u)_{u\in U}$ of probability distributions, one for each possible view $u$ that Player(1) can
have of the system at any time of the game.⁵ Nevertheless, as by assumption the
protocol receives no information from the adversary, we find it easier to adopt the
global point of view: in this case a decision tree (of Player(1)) is simply a sequence
$(s_1,\ldots,s_p)$ in $\mathcal{P}_n(p)^p$. We therefore define a protocol to be a probability distribution
on $\mathcal{P}_n(p)^p$.


Note that the distinction between protocol and protocol-designer (resp. between
adversary and adversary-designer) is often not kept and we refer to properties of
the protocol (resp. the adversary) that should be more properly attributed to the
protocol-designer (resp. the adversary-designer). A case where both points of view
are equally valid is when we refer to the decisions made by the protocol or by the
protocol-designer: the protocol is the strategy used by the protocol-designer for its
decision making. By contrast, when speaking of “the knowledge held by the
adversary” or of “the information received by the adversary” we should more correctly
speak of the knowledge held by the adversary-designer: by definition, an adversary
$\mathcal{A}$ is a family $(Q_v)_v$ of probability distributions which receives no information. On
the other hand, Player(2), the adversary-designer, does receive some information
during an execution and uses this information as prescribed by its strategy $\mathcal{A}$.


⁴This duality is well known, but not everyone realizes the caveat about finiteness. For instance
Hart et al. say in [29], and we quote: “There are two main ways of introducing randomizations ...
The first consists of a random choice of the next process at each node of the execution tree ... The
second way consists of taking a probability distribution over the set of deterministic [executions]
(i.e., make all the random decisions a priori.) ... It is easy to see that the first case (independent
randomization at each decision node) is a special case of the second one (by doing all randomizations
at the start!)” Note though that, in the case of infinite executions, it is not trivial to convert “the
first way” into “the second way”. This is actually the heart of the problem in the construction of
the probability distribution given on page 41.

⁵As previously for the adversary, note that, according to the general presentation made in
Chapter 2, we should actually define a protocol to be a family of probability spaces $(\Omega_u,\mathcal{G}_u,P_u)_{u\in U}$.
Nevertheless we can take all the measurable spaces $(\Omega_u,\mathcal{G}_u)$ to be equal to $(\mathcal{P}_n(p), 2^{\mathcal{P}_n(p)})$. This
allows us to omit mentioning $(\Omega_u,\mathcal{G}_u)$ in the definition of the protocol.

As is established in Chapter 2, a given strategy $\pi$ of Player(1) and a given strategy $\mathcal{A}$
of Player(2) define a unique probability distribution $P_{\pi,\mathcal{A}}$ on the set $\Omega$ of executions.
We will give a precise characterization of $P_{\pi,\mathcal{A}}$ in Section 7.2. We will also define
there formally the random variable $T$ representing the survival time. With these
notions, the optimal expected survival time achievable against every adversary is

$$\sup_{\pi}\inf_{\mathcal{A}} E_{\pi,\mathcal{A}}[T].$$


Note that we adopt here the Von Neumann setting described in Chapter 6: for each
of the two players the set of pure strategies is the finite set of decision trees of that
player. Hence, by Von Neumann’s theorem (see Theorem 8.3.2),

$$\sup_{\pi}\inf_{\mathcal{A}} E_{\pi,\mathcal{A}}[T] = \inf_{\mathcal{A}}\sup_{\pi} E_{\pi,\mathcal{A}}[T],$$

and the general proof methodology described in Chapter 6, page 127, applies. Our
proof will follow this methodology.
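
The duality invoked above can be checked by hand on a toy matrix game. The sketch below is not the scheduling game: it solves an arbitrary 2×2 zero-sum game with the standard closed-form formulas, and the function names are our own. The row player maximizes (the role of the protocol-designer) and the column player minimizes (the role of the adversary-designer).

```python
from fractions import Fraction


def solve_2x2(a, b, c, d):
    """Maximizing row player, minimizing column player, payoff matrix [[a, b], [c, d]].

    Returns (x, y, value): x is the probability of row 1, y that of column 1.
    """
    a, b, c, d = map(Fraction, (a, b, c, d))
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:                      # saddle point in pure strategies
        x = Fraction(1) if min(a, b) >= min(c, d) else Fraction(0)
        y = Fraction(1) if max(a, c) <= max(b, d) else Fraction(0)
        return x, y, maximin
    denom = a + d - b - c                       # non-zero when there is no saddle point
    return (d - c) / denom, (d - b) / denom, (a * d - b * c) / denom


def row_guarantee(x, a, b, c, d):
    """inf over columns of the expected payoff when row 1 is played with probability x."""
    return min(x * a + (1 - x) * c, x * b + (1 - x) * d)


def column_guarantee(y, a, b, c, d):
    """sup over rows of the expected payoff when column 1 is played with probability y."""
    return max(y * a + (1 - y) * b, y * c + (1 - y) * d)


game = (3, 0, 1, 2)
x, y, value = solve_2x2(*game)
```

For this matrix the guarantee of the row player and the guarantee of the column player coincide with `value`, which is the equality of sup inf and inf sup on a finite game.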


The rest of the chapter is organized as follows. In Section 7.2 we formalize the
previous discussion and construct the probabilistic model used in the subsequent
sections. In Section 7.3 we define for every $\pi$ and $\mathcal{A}$ two pseudo-probability
distributions $P_\pi$ and $P_{\mathcal{A}}$ which play a crucial role in the proof. (The denomination
“pseudo-probability distribution” refers to the fact that $P_\pi$ and $P_{\mathcal{A}}$ are not
probability distributions but that, as is asserted in Lemma 7.3.4, some “conditional”
variants of them are well defined probability distributions.) Section 7.4 describes
a class Prot(Prog$_0$) of protocols $\pi_0$ and an adversary $\mathcal{A}_0$. The main result of this
chapter, presented in Theorem 7.4.6, asserts that $\pi_0$ and $\mathcal{A}_0$ verify point 2 of
the methodology given on page 127, and hence that every protocol $\pi_0 \in$ Prot(Prog$_0$)
is optimal:

$$\sup_{\pi} E_{\pi,\mathcal{A}_0}[T] = E_{\pi_0,\mathcal{A}_0}[T] = \inf_{\mathcal{A}} E_{\pi_0,\mathcal{A}}[T].$$

The proof of this theorem is the object of the rest of the chapter. Section 7.5 presents
some random variables that are fundamental for the proof. Section 7.6 establishes
that $\sup_{\pi} E_{\pi,\mathcal{A}_0}[T] = E_{\pi_0,\mathcal{A}_0}[T]$. Similarly, Section 7.7 establishes in essence that
$\inf_{\mathcal{A}} E_{\pi_0,\mathcal{A}}[T] = E_{\pi_0,\mathcal{A}_0}[T]$.


7.2 The Probabilistic Model


We formalize here 1) the notions of protocol and adversary and 2) the probability 
spaces that will be used in the subsequent analyses. Recall that a protocol is a 
strategy of Player(1) and that, similarly, an adversary is a strategy of Player(2). 


For every $t$, $1 \le t \le p$, a sequence $\sigma = (s_1,s_2,\ldots,s_t)$ in $\mathcal{P}_n(p)^t$ is called a $t$-schedule
and a sequence $\phi = (f_1,\ldots,f_t)$ of elements of $[p]$ is called a $t$-fault-sequence. A
$t$-fault-sequence $\phi$ is adapted to a schedule $\sigma$ if $f_j \in s_j$ for all $j$, $1 \le j \le t$, and if, for
all $j$, the condition $s_j \not\subseteq \{f_1,\ldots,f_{j-1}\}$ implies that $f_j \in s_j - \{f_1,\ldots,f_{j-1}\}$.
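
The adaptedness condition translates directly into a small predicate. This is a sketch with a hypothetical function name; schedules are sequences of sets and fault sequences are sequences of processor numbers. The length check is our own addition, reflecting that a $t$-fault-sequence is compared with a $t$-schedule.

```python
def is_adapted(faults, schedule):
    """True if (f_1, ..., f_t) is adapted to (s_1, ..., s_t)."""
    if len(faults) != len(schedule):
        return False
    earlier = set()                       # {f_1, ..., f_{j-1}}
    for s_j, f_j in zip(schedule, faults):
        if f_j not in s_j:
            return False
        # If s_j still holds a working processor, the fault must hit one of them.
        if not set(s_j) <= earlier and f_j in earlier:
            return False
        earlier.add(f_j)
    return True
```

In words: a fault always lands in the set in operation, and it lands on a processor that is not yet faulty whenever such a processor exists in that set.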


A $t$-execution is an alternating sequence $\omega = (s_1,f_1,s_2,f_2,\ldots,s_t,f_t)$ obtained from
a $t$-schedule $(s_1,\ldots,s_t)$ and a $t$-fault-sequence $(f_1,\ldots,f_t)$ adapted to $(s_1,\ldots,s_t)$. A
$t$-odd-execution is a sequence $\omega = (s_1,f_1,s_2,f_2,\ldots,s_t)$ obtained from a $t$-schedule
$(s_1,\ldots,s_t)$ and a $(t-1)$-fault-sequence $(f_1,\ldots,f_{t-1})$ adapted to $(s_1,\ldots,s_{t-1})$. For
simplicity, we use the term execution in place of $p$-execution.


We define the sample space $\Omega$ to be the set of all executions: $\Omega \stackrel{\mathrm{def}}{=} \{\omega;\ \omega \text{ execution}\}$.
We endow $\Omega$ with its discrete σ-field $\mathcal{G} \stackrel{\mathrm{def}}{=} 2^{\Omega}$.

We now define various random variables on $(\Omega,\mathcal{G})$. Throughout, random variables
are denoted by upper-case letters and their realizations by lower-case letters. For all $t$,
$1 \le t \le p$, $S_t$ and $F_t$ are defined by $S_t(\omega) = s_t$ and $F_t(\omega) = f_t$. This allows us to
define the derived random variables $\mathcal{S}_t = (S_1,\ldots,S_t)$, $\mathcal{F}_t = (F_1,\ldots,F_t)$ and
$\mathcal{E}_t = (S_1,F_1,\ldots,F_{t-1},S_t)$. $\mathcal{S}_t$, $\mathcal{F}_t$ and $\mathcal{E}_t$ are respectively the random $t$-schedule, the
random $t$-fault sequence and the random $t$-odd-execution produced up to time $t$.

We also let $\mathcal{G}_t \stackrel{\mathrm{def}}{=} \sigma(S_1,F_1,\ldots,S_t,F_t)$ and $\mathcal{G}'_t \stackrel{\mathrm{def}}{=} \sigma(S_1,F_1,\ldots,S_{t+1})$ be the σ-fields
of events “happening no later” than the selection of $f_t$ and $s_{t+1}$, respectively.


We now define protocols and adversaries. As recalled in the previous section, the
protocols and the adversaries are the strategies of Player(1) and Player(2) taking
random steps in turn and sending information to the other player at the end of each
step. The probability distribution used for this step depends on the view of the
system held by the player. In our specific problem, Player(2) is informed of all the
moves of Player(1) (but nothing else about the protocol $\pi$ selected by Player(1)),
whereas Player(1) learns nothing from and about Player(2). We adopt the local
point of view to describe an adversary and the global point of view to describe a
protocol.⁶


Hence, an adversary $\mathcal{A}$ is a family of probability distributions $(Q_v)_{v\in V}$ on $[p]$, one
for each $t$, $1 \le t \le p$, and each $t$-odd-execution $v = (s_1,f_1,\ldots,f_{t-1},s_t)$, and such
that, with $Q_v$-probability one, $(f_1,\ldots,f_t)$ is adapted to $(s_1,\ldots,s_t)$.
A scheduling protocol $\pi$ is a probability distribution on $\mathcal{P}_n(p)^p$.


⁶See page 139 for a definition of the local and of the global points of view.


As is established in Section 2.4, there is a unique and well defined probability
distribution $P_{\pi,\mathcal{A}}$ induced on the set of executions by a given protocol $\pi$ and a given
adversary $\mathcal{A}$. For the sake of illustration, we give an explicit characterization of $P_{\pi,\mathcal{A}}$
in the next proposition.


Proposition 7.2.1 Let $\pi$ and $\mathcal{A} = (Q_v)_{v\in V}$ be a protocol and an adversary as
defined above. Then there is a unique and well defined probability distribution $P_{\pi,\mathcal{A}}$
on $(\Omega,\mathcal{G})$ satisfying the following two properties:


i $P_{\pi,\mathcal{A}}[\mathcal{S}_p = \cdot\,] = \pi$.


ii For every $t$, $1 \le t \le p$, every $t$-odd-execution $v = (s_1,f_1,\ldots,f_{t-1},s_t)$, and every
$(p-t)$-schedule $(s_{t+1},\ldots,s_p)$ such that $\pi(s_1,\ldots,s_p) > 0$ and
$Q_{s_1}(f_1)\,Q_{s_1 f_1 s_2}(f_2)\cdots Q_{s_1 f_1 \ldots s_{t-1}}(f_{t-1}) > 0$, we have:

$$P_{\pi,\mathcal{A}}[F_t = \cdot \mid \mathcal{E}_t = v,\ (S_{t+1},\ldots,S_p) = (s_{t+1},\ldots,s_p)] = Q_v.$$


Property i formalizes the fact that the protocol receives no on-line information
and makes its decisions in isolation. Property ii formalizes the fact that the
adversary is on-line and selects $F_t$ based on the sole knowledge of the past
odd-execution $\mathcal{E}_t$, independently of the schedule $(S_{t+1},\ldots,S_p)$ selected for subsequent
times. Using Convention 8.1.1 we extend the definition of the conditional probability
in ii and set $P_{\pi,\mathcal{A}}[F_t = \cdot \mid \mathcal{E}_t = v,\ (S_{t+1},\ldots,S_p) = (s_{t+1},\ldots,s_p)] = 0$ whenever
$Q_{s_1}(f_1)\,Q_{s_1 f_1 s_2}(f_2)\cdots Q_{s_1 f_1 \ldots s_{t-1}}(f_{t-1}) = 0$.


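Proof. Let $\omega = (s_1,f_1,s_2,f_2,\ldots,s_p,f_p)$ be a generic execution in $\Omega$ and, for
every $t$, let $\omega_t = (s_1,f_1,\ldots,s_t)$ be the associated $t$-odd-execution. By successive
conditioning we can write

$$P_{\pi,\mathcal{A}}(\omega) = P_{\pi,\mathcal{A}}[\mathcal{S}_p = (s_1,\ldots,s_p)]\; P_{\pi,\mathcal{A}}[F_1 = f_1 \mid \mathcal{E}_1 = \omega_1,\ \mathcal{S}_{2,p} = (s_2,\ldots,s_p)] \cdots$$
$$\cdots\; P_{\pi,\mathcal{A}}[F_{p-1} = f_{p-1} \mid \mathcal{E}_{p-1} = \omega_{p-1},\ S_p = s_p]\; P_{\pi,\mathcal{A}}[F_p = f_p \mid \mathcal{E}_p = \omega_p],$$

where $\mathcal{S}_{2,p}$ denotes $(S_2,\ldots,S_p)$, so that we see that the conditions i and ii imply that,
if it exists, $P_{\pi,\mathcal{A}}(\omega)$ must be equal to

$$\pi(s_1,\ldots,s_p)\; Q_{\omega_1}(f_1)\, Q_{\omega_2}(f_2) \cdots Q_{\omega_p}(f_p).$$

We easily check that $P_{\pi,\mathcal{A}}$ thus defined is additive and that $P_{\pi,\mathcal{A}}(\Omega) = 1$. As $\mathcal{G}$ is
finite, the σ-additivity of $P_{\pi,\mathcal{A}}$ holds trivially. Hence $P_{\pi,\mathcal{A}}$ is a well defined probability
measure.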
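
The product form obtained in the proof can be checked numerically on a small instance. The sketch below is an illustration under assumptions of our own: $p = 3$, $n = 2$, a uniform protocol and an adversary that is uniform over the admissible faults; the function names are hypothetical. It enumerates all executions with their probability $\pi(s_1,\ldots,s_p)\prod_t Q_{\omega_t}(f_t)$ and sums them with exact arithmetic.

```python
from fractions import Fraction
from itertools import combinations, product


def admissible(s_t, earlier):
    """Faults allowed at this step by the adaptedness condition."""
    working = sorted(set(s_t) - set(earlier))
    return working if working else sorted(s_t)


def uniform_adversary(view):
    """Q_v for a hypothetical adversary: uniform over the admissible faults."""
    faults, s_t = view[1::2], view[-1]
    options = admissible(s_t, faults)
    return {f: Fraction(1, len(options)) for f in options}


def executions(pi, adversary):
    """Yield (omega, P(omega)) with P(omega) = pi(s_1..s_p) * prod_t Q_{omega_t}(f_t)."""
    for schedule, weight in pi.items():
        partial = [((), weight)]                    # (s_1, f_1, ..., s_t, f_t), probability
        for s_t in schedule:
            extended = []
            for prefix, prob in partial:
                view = prefix + (s_t,)              # the t-odd-execution omega_t
                for f_t, q in adversary(view).items():
                    extended.append((view + (f_t,), prob * q))
            partial = extended
        yield from partial


p, n = 3, 2
subsets = [frozenset(c) for c in combinations(range(1, p + 1), n)]
schedules = list(product(subsets, repeat=p))
pi = {s: Fraction(1, len(schedules)) for s in schedules}   # hypothetical uniform protocol
total = sum(prob for _, prob in executions(pi, uniform_adversary))
```

The total mass is one because, for each schedule, the adversary's conditional distributions sum to one at every step, and the weights of the schedules sum to one.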


We have therefore defined the family of probability distributions $(P_{\pi,\mathcal{A}})_{\pi,\mathcal{A}}$ on the
same space $(\Omega,\mathcal{G})$. In situations different from our scheduling problem, such a
modeling is in general not possible and a different probability space $(\Omega_{\pi,\mathcal{A}},\mathcal{G}_{\pi,\mathcal{A}})$
must be associated to each couple $(\pi,\mathcal{A})$. Also, in the case of infinite executions,
the construction of a measure $P_{\pi,\mathcal{A}}$ is a non-trivial probability problem requiring the
use of an extension theorem (e.g., Kolmogorov’s extension theorem). We presented
the general construction in Section 2.4 of Chapter 2 when the coins of both
players have at most countably many outcomes.


We can now set up formally the optimization problem presented in Section 7.1.
The survival time is defined to be the random variable
$T = \max\{t;\ \forall u \le t,\ |S_u \cap \{F_1,\ldots,F_u\}| \le m\}$. For every protocol $\pi$, let
$t(\pi) = \inf_{\mathcal{A}} E_{\pi,\mathcal{A}}[T]$ be the worst case performance of $\pi$. We let

$$t_{opt} = \sup_{\pi} t(\pi) = \sup_{\pi}\inf_{\mathcal{A}} E_{\pi,\mathcal{A}}[T].$$
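
As an illustration, the survival time and its expectation can be computed as follows. This is a sketch with hypothetical names; it reads the definition of $T$ with non-strict inequalities ($u \le t$ and at most $m$ faulty processors in operation), an assumption consistent with the crash rule of Section 7.1.1. An execution is encoded as a tuple of pairs $(s_u, f_u)$.

```python
from fractions import Fraction


def survival_time(execution, m):
    """T = max{t : for all u <= t, |s_u ∩ {f_1, ..., f_u}| <= m} (0 if no such t)."""
    faulty, t = set(), 0
    for s_u, f_u in execution:
        faulty.add(f_u)
        if len(set(s_u) & faulty) > m:
            break
        t += 1
    return t


def expected_survival(distribution, m):
    """E[T] for a finite distribution given as {execution: probability}."""
    return sum(prob * survival_time(omega, m) for omega, prob in distribution.items())


# Two executions for p = 3, n = 2, m = 1.
S12, S13, S23 = frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})
omega_a = ((S12, 2), (S13, 3), (S23, 3))   # survives faults 1 and 2, crashes at time 3
omega_b = ((S12, 2), (S12, 1), (S12, 1))   # crashes at time 2
mixture = {omega_a: Fraction(1, 2), omega_b: Fraction(1, 2)}
```

Computing $t(\pi)$ would additionally require an infimum over all adversaries, which this sketch does not attempt.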
7.3 Some Specific Probability Results


7.3.1 A formal interpretation of the on-line knowledge of the adversary


We begin this section by presenting a lemma expressing that, for every protocol and
every adversary, conditioned on the past, the selection $F_t$ made by the adversary at
time $t$ is independent of the choices $S_{t+1},\ldots,S_p$ made for ensuing times by the
protocol. This shows that the definitions of a protocol and of an adversary given in the
previous section formalize accurately the on-line nature of the information received
by the adversary: even though a protocol decides the whole schedule $s_1,\ldots,s_p$ at
the beginning of the execution, the adversary does not get to see each decision $s_t$
before time $t$. (As mentioned in Section 7.1.2, page 139, a more correct statement
is “the adversary-designer does not get to see each decision $s_t$ before time $t$”.)


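Lemma 7.3.1 Let $\pi$ be a protocol, $\mathcal{A}$ be an adversary, $t$, $t \le p-1$, be a time and $v$
be a $t$-odd-execution such that $P_{\pi,\mathcal{A}}[\mathcal{E}_t = v] > 0$. Then the random variables $F_t$ and
$\mathcal{S}_{t+1,p} = (S_{t+1},\ldots,S_p)$ are independent with respect to the measure $P_{\pi,\mathcal{A}}$ when
conditioned on $\mathcal{E}_t = v$.


Proof. This is a direct consequence of Proposition 7.2.1: by Condition ii, for
every $(p-t)$-schedule $\sigma$ and every $v$,

$$P_{\pi,\mathcal{A}}[F_t = \cdot \mid \mathcal{E}_t = v,\ \mathcal{S}_{t+1,p} = \sigma] = P_{\pi,\mathcal{A}}[F_t = \cdot \mid \mathcal{E}_t = v].$$

This expresses exactly the independence of $F_t$ and $\mathcal{S}_{t+1,p}$ conditioned on $\mathcal{E}_t = v$.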


7.3.2 The Notations $P_\pi$ and $P_{\mathcal{A}}$


The next two definitions, Definition 7.3.1 and Definition 7.3.2, introduce the family
of protocols that select with non-zero probability a given $t$-schedule, and the family
of adversaries that select with non-zero probability a given $t$-fault sequence. The
associated lemmas, Lemma 7.3.2 and Lemma 7.3.3, introduce and justify the notations
$P_{\mathcal{A}}$ and $P_\pi$. These two lemmas will be fundamental in our proofs for the following
reasons. We will introduce one (family of) algorithm $\pi_0$ and an adversary $\mathcal{A}_0$. In
one part of the proof we will establish that $\pi_0$ is optimal against $\mathcal{A}_0$. Throughout
this part we will consider only the adversary $\mathcal{A}_0$. Lemma 7.3.2 will allow us to
consider the unique expression $P_{\mathcal{A}_0}$ instead of the family $(P_{\pi,\mathcal{A}_0})_{\pi\in\Pi}$. This will make
the analysis much simpler as the optimization over $\pi$ will not involve the probability
measure. Symmetrically, in the second part of the proof we will in essence
establish that $\mathcal{A}_0$ is optimal against $\pi_0$. In this part we will need to consider only the
expression $P_{\pi_0}$ instead of the family $(P_{\pi_0,\mathcal{A}})_{\mathcal{A}\in A}$.


Definition 7.3.1 Let $(s_1,\ldots,s_t)$ be some $t$-schedule. Then

$$\mathrm{Pgenerating}(s_1,\ldots,s_t) \stackrel{\mathrm{def}}{=} \{\pi;\ \exists (s_{t+1},\ldots,s_p),\ \pi[(s_1,\ldots,s_p)] > 0\}.$$
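
Membership in Pgenerating is a simple prefix test once a protocol is stored as a finite table. The sketch below assumes, as a convention of our own, that $\pi$ is a dictionary from full schedules (tuples of sets) to probabilities; the function name is hypothetical.

```python
def in_pgenerating(pi, prefix):
    """True if pi gives positive probability to some schedule extending `prefix`."""
    t = len(prefix)
    return any(prob > 0 and tuple(schedule[:t]) == tuple(prefix)
               for schedule, prob in pi.items())


# Hypothetical protocol for p = 3, n = 2; the last schedule has probability zero.
S12, S13, S23 = frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})
pi = {(S12, S13, S23): 0.5, (S12, S23, S13): 0.5, (S13, S12, S23): 0.0}
```

With this table, $\pi$ belongs to Pgenerating($\{1,2\}$) but not to Pgenerating($\{1,3\}$), since the only schedule starting with $\{1,3\}$ has probability zero.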


As just mentioned, the next lemma provides the fundamental technical tool that will
allow us to handle conveniently all the probabilistic expressions required in the proof
of Lemma 7.4.7, the proof that $\pi_0$ is optimal against $\mathcal{A}_0$.⁷ Similarly, Lemma 7.3.3
will provide the probabilistic tool required in the proof of Lemma 7.4.8, the proof
that $\mathcal{A}_0$ is optimal against $\pi_0$.


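Lemma 7.3.2 Let $t$, $1 \le t \le p$, and let $\sigma = (s_1,\ldots,s_t)$ be a $t$-schedule. Let $\Phi \in \mathcal{G}_t$.
Then, for every adversary $\mathcal{A}$,

$$P_{\pi,\mathcal{A}}[\Phi \mid \mathcal{S}_t = \sigma]$$

is independent of the protocol $\pi \in \mathrm{Pgenerating}(\sigma)$. We let

$$P_{\mathcal{A}}[\Phi \mid \mathcal{S}_t = \sigma]$$

denote this common value.
Following Convention 8.1.1, $P_{\pi,\mathcal{A}}[\Phi \mid \mathcal{S}_t = \sigma]$ is set to zero if $\pi \notin \mathrm{Pgenerating}(\sigma)$.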


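Proof. Let $\pi$ be any protocol in Pgenerating($\sigma$) and let $\mathcal{A} = (Q_v)_{v\in V}$ be a given
adversary. Recall that, by definition (see page 141), $\mathcal{G}_t = \sigma(S_1,F_1,\ldots,S_t,F_t)$.
Thus, an event $\Phi$ is in $\mathcal{G}_t$ if and only if there exists a boolean random variable $\varphi$
depending on $\omega$ only through the random variables $(S_1,F_1,\ldots,S_t,F_t)$ (i.e., $\varphi(\omega) =
\psi(S_1(\omega),\ldots,F_t(\omega))$ for some real-valued function $\psi$), such that $\Phi = \{\omega;\ \varphi(\omega) = 1\}$.
We have:

$$P_{\pi,\mathcal{A}}[\Phi \mid \mathcal{S}_t = \sigma] = E_{\pi,\mathcal{A}}[\varphi \mid \mathcal{S}_t = \sigma] = \sum_{(f_1,\ldots,f_t)} \psi(s_1,f_1,\ldots,s_t,f_t)\; Q_{s_1}(f_1)\,Q_{s_1 f_1 s_2}(f_2)\cdots Q_{s_1 f_1 \ldots s_t}(f_t),$$

where the sum is over all $t$-fault-sequences $(f_1,\ldots,f_t)$ and the last equality follows
from Proposition 7.2.1. This last expression is independent of the protocol $\pi$, as needed.


⁷Both $\pi_0$ and $\mathcal{A}_0$ are defined in Section 7.4.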


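The following definition describes the set of adversaries that generate with non-zero
probability a given fault sequence. Note that, by Lemma 7.3.2, $P_{\mathcal{A}}[\mathcal{F}_t = \phi \mid \mathcal{S}_t = \sigma]$
is itself well defined.


Definition 7.3.2 Let $\sigma$ be a $t$-schedule, and $\phi$ a $t$-fault sequence adapted to $\sigma$. Then

$$\mathrm{Agenerating}_\sigma(\phi) \stackrel{\mathrm{def}}{=} \{\mathcal{A};\ P_{\mathcal{A}}[\mathcal{F}_t = \phi \mid \mathcal{S}_t = \sigma] > 0\}.$$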


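Lemma 7.3.3 Let $t$, $1 \le t \le p$, let $\sigma$ be a $t$-schedule, $\phi$ be a $t$-fault sequence adapted
to $\sigma$ and let $\Phi$ be an event in $\mathcal{G}'_t$. Then, for every protocol $\pi \in \mathrm{Pgenerating}(\sigma)$, the
expression

$$P_{\pi,\mathcal{A}}[\Phi \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi]$$

is independent of the adversary $\mathcal{A} \in \mathrm{Agenerating}_\sigma(\phi)$. We let $P_\pi[\Phi \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi]$
denote this common value.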


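Proof. The proof closely follows that of Lemma 7.3.2. We write $\sigma = (s_1,\ldots,s_t)$
and $\phi = (f_1,\ldots,f_t)$. By assumption, there exists a boolean random variable $\varphi$
depending on $\omega$ only through the random variables $(S_1,F_1,\ldots,S_t,F_t,S_{t+1})$ (i.e., $\varphi(\omega) =
\psi(S_1,F_1,\ldots,S_t,F_t,S_{t+1})(\omega)$ for some real-valued function $\psi$), such that $\Phi =
\{\omega;\ \varphi(\omega) = 1\}$. Let $\pi$ be a protocol in Pgenerating($\sigma$) and $\mathcal{A} = (Q_v)_{v\in V}$ be
an adversary in Agenerating$_\sigma$($\phi$). We let $P_\pi[S_{t+1} = s_{t+1} \mid \mathcal{S}_t = \sigma]$ denote the
conditional probability $\big(\sum_{\tilde\sigma} \pi[(\sigma,s_{t+1},\tilde\sigma)]\big) / \big(\sum_{\sigma'} \pi[(\sigma,\sigma')]\big)$ where the first summation
is over all $(p-(t+1))$-schedules $\tilde\sigma$ and the second over all $(p-t)$-schedules $\sigma'$. Then:

$$P_{\pi,\mathcal{A}}[\Phi \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi] = E_{\pi,\mathcal{A}}[\varphi \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi] = \sum_{s_{t+1}} P_\pi[S_{t+1} = s_{t+1} \mid \mathcal{S}_t = \sigma]\; \psi(s_1,f_1,\ldots,s_t,f_t,s_{t+1}).$$

This last expression is independent of the adversary $\mathcal{A}$ used and this concludes the
proof.


Lemma 7.3.4 Let $t$ be a time, $1 \le t \le p$. Let $\sigma$ be a $t$-schedule and $\mathcal{A}$ be an adversary.
Then $P_{\mathcal{A}}[\,\cdot \mid \mathcal{S}_t = \sigma]$ is a probability distribution on $(\Omega,\mathcal{G}_t)$. Similarly, if $\sigma$ is
a $t$-schedule, $\phi$ a $t$-fault sequence adapted to $\sigma$ and $\pi$ a protocol in Pgenerating($\sigma$),
then $P_\pi[\,\cdot \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi]$ is a probability distribution on $(\Omega,\mathcal{G}'_t)$.


Proof. $P_{\mathcal{A}}[\,\cdot \mid \mathcal{S}_t = \sigma]$ is defined on $(\Omega,\mathcal{G}_t)$ to be equal to $P_{\pi,\mathcal{A}}[\,\cdot \mid \mathcal{S}_t = \sigma]$ for
any $\pi \in \mathrm{Pgenerating}(\sigma)$ and is therefore a well defined probability distribution (on
$(\Omega,\mathcal{G}_t)$). Similarly, $P_\pi[\,\cdot \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi]$ is defined on $(\Omega,\mathcal{G}'_t)$ to be equal to
$P_{\pi,\mathcal{A}}[\,\cdot \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi]$ for any $\mathcal{A} \in \mathrm{Agenerating}_\sigma(\phi)$ and hence is a well defined
probability distribution on $(\Omega,\mathcal{G}'_t)$.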


Note: In spite of its apparent simplicity, Lemma 7.3.2 answers a subtle point
illustrating the difference between implicit and explicit knowledge that we quickly
recall.⁸


In order to compute the optimal survival time $t_{opt} = \sup_{\pi}\inf_{\mathcal{A}} E_{\pi,\mathcal{A}}[T]$ we are led
to consider the performance values $t(\pi) = \inf_{\mathcal{A}} E_{\pi,\mathcal{A}}[T]$ associated to all protocols
$\pi$. In the previous formula the infimum is taken over all adversaries for a given
$\pi$. A common interpretation of this fact is that the optimal adversary “knows”
the protocol: this consideration entitled us to assume an off-line adversary in the
deterministic case. Hence, such an adversary is provided with:


• the on-line information of the past schedule. At every time $t$, we can picture
that an explicit message is relayed to the adversary to inform it of the set $s_t$
last selected by the protocol.


• the off-line information of the protocol $\pi$ that $\mathcal{A}$ is associated with: this
information is implicitly provided to an optimal adversary.


These two notions of knowledge are very different, and Lemma 7.3.2 would not hold if we assumed that the adversary was provided with the explicit knowledge of $\pi$ and was able to use this information in the selection of the elements $F_1, \ldots, F_t$. For instance, consider the case where $p = 3$, $n = 2$ and $m = 1$. Consider a greedy adversary $\mathcal{A}$, selecting at each time $t$ a processor $F_t$ so as to maximize the probability that $F_t$ is in $S_{t+1}$. Assume that $s_1 = \{1,2\}$. Consider two different protocols $\pi_1$ and $\pi_2$. Assume that protocol $\pi_1$ always selects $s_2 = \{1,3\}$ whereas $\pi_2$ always selects $s_2 = \{2,3\}$. If $\mathcal{A}$ knows the protocol it is associated with, $\mathcal{A}$ selects $F_1 = 1$ with probability one when associated with $\pi_1$, and selects $F_1 = 2$ with probability one


8 These notions are presented on page 207.
when associated with $\pi_2$. Hence, in this case, the probability

$$P_{\pi,\mathcal{A}}[F_1 = 1 \mid S_1 = \{1,2\}]$$

is not independent of the protocol $\pi$ considered, even though the event $\{F_1 = 1\}$ is clearly in the $\sigma$-field $\mathcal{G}_1$.
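The dependence on the protocol can be made concrete with a minimal sketch. It assumes a hypothetical helper `greedy_fault` that is handed the protocol's next set explicitly (exactly the explicit knowledge discussed above); the names `PI_1`, `PI_2` and the tie-breaking rule are illustrative assumptions and are not part of the model.

```python
def greedy_fault(s1, s2):
    """Greedy choice of F_1: a processor of s1 that the protocol reuses in s2.

    Assumes the adversary is told s2 explicitly. Ties, and the case where
    no processor of s1 is reused, are broken by taking the smallest label.
    """
    reused = s1 & s2
    return min(reused) if reused else min(s1)


# Deterministic protocols of the example (p = 3, n = 2, m = 1).
PI_1 = ({1, 2}, {1, 3})  # pi_1: s_1 = {1, 2}, s_2 = {1, 3}
PI_2 = ({1, 2}, {2, 3})  # pi_2: s_1 = {1, 2}, s_2 = {2, 3}

fault_under_pi_1 = greedy_fault(*PI_1)
fault_under_pi_2 = greedy_fault(*PI_2)
```

Both protocols produce the same first set, yet the fault chosen in round 1 differs, which is why the conditional probability above cannot be independent of the protocol in this variant of the model.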


Hence Lemma 7.3.2 would not be true, had we assumed, as in [25], that the adversary “knew” the protocol. Recall that, as is argued on page 131, this change of model only affects the way the adversary is defined and interacts with the protocol. In particular, it does not affect the class of optimal protocols. Nevertheless, as Lemma 7.3.2 is crucial for the proofs given in Section 7.6, our proof of optimality would not carry over to the Strong Byzantine setting.


7.3.3 Applications of the definition of $P_{\mathcal{A}}$ and of $P_\pi$


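Lemma 7.3.5 For all $j$ and $t$, $j \le t$, all $t$-schedules $\sigma$ and all adversaries $\mathcal{A}$, $P_{\mathcal{A}}[F_j \in \cdot \mid \mathbf{S}_t = \sigma]$ and $P_{\mathcal{A}}[T \ge t \mid \mathbf{S}_t = \sigma]$ are well defined probabilities. Similarly, if $\sigma$ is a $t$-schedule in $\mathcal{N}_t$, $\phi$ a $t$-fault-sequence adapted to $\sigma$ and $\pi$ a protocol in $Pgenerating(\sigma)$, then $P_\pi[T \ge t+1 \mid \mathbf{S}_t = \sigma, \mathbf{F}_t = \phi]$ is a well defined probability.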


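PROOF. The random variable $F_j$ is clearly $\mathcal{G}_t$-measurable for all $j$, $1 \le j \le t$ (by definition of $\mathcal{G}_t$!). Hence the first result is a simple application of Lemma 7.3.2. On the other hand we can write

$$\{T \ge t\} = \bigcap_{u=1}^{t} \{|S_u \cap \{F_1, \ldots, F_u\}| \le m\} = \bigcap_{u=1}^{t} \{|S_u \cap \{F_1, \ldots, F_{u-1}\}| \le m-1\}.$$

For all $j$, the event $\{|S_j \cap \{F_1, \ldots, F_{j-1}\}| \le m-1\}$ is clearly in $\mathcal{G}'_{j-1} \subseteq \mathcal{G}'_{t-1} \subseteq \mathcal{G}_t$. Hence $\{T \ge t\}$ is also in $\mathcal{G}_t$ and Lemma 7.3.2 again shows that $P_{\mathcal{A}}[T \ge t \mid \mathbf{S}_t = \sigma]$ is a well defined probability. Similarly $\{T \ge t+1\}$ is in $\mathcal{G}'_t$ so that, by Lemma 7.3.3, $P_\pi[T \ge t+1 \mid \mathbf{S}_t = \sigma, \mathbf{F}_t = \phi]$ is a well defined probability.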


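The following lemma expresses that, conditioned on $\mathbf{S}_{t-1} = \sigma$, the events $S_t = s$ and $T \ge t-1$ are independent with respect to the measure $P_{\mathcal{A}}$ (i.e. with respect to any measure $P_{\pi,\mathcal{A}}$).

Lemma 7.3.6 For every $t$, $2 \le t \le N$, every $(t-1)$-schedule $\sigma$ and every $s \in P_n(p)$,

$$P_{\mathcal{A}}[T \ge t-1 \mid \mathbf{S}_t = (\sigma, s)] = P_{\mathcal{A}}[T \ge t-1 \mid \mathbf{S}_{t-1} = \sigma].$$

PROOF. Note first that, by Lemma 7.3.5, the quantities involved in the previous equality are well defined. Writing $\sigma = (s_1, \ldots, s_{t-1})$, we have

$$P_{\mathcal{A}}[T \ge t-1 \mid \mathbf{S}_{t-1} = \sigma, S_t = s] = P_{\mathcal{A}}\Big[\bigcap_{u=1}^{t-1} \{S_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\} \;\Big|\; \mathbf{S}_{t-1} = \sigma, S_t = s\Big] = P_{\mathcal{A}}\Big[\bigcap_{u=1}^{t-1} \{s_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\} \;\Big|\; \mathbf{S}_{t-1} = \sigma, S_t = s\Big].$$

By Lemma 7.3.1, conditioned on $\mathbf{S}_{t-1} = \sigma$, the random variables $(F_1, \ldots, F_{t-1})$ and $S_t$ are independent. Hence, conditioned on $\mathbf{S}_{t-1} = \sigma$, the events $\bigcap_{u=1}^{t-1} \{s_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\}$ and $\{S_t = s\}$ are similarly independent so that

$$P_{\mathcal{A}}\Big[\bigcap_{u=1}^{t-1} \{s_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\} \;\Big|\; \mathbf{S}_{t-1} = \sigma, S_t = s\Big] = P_{\mathcal{A}}\Big[\bigcap_{u=1}^{t-1} \{s_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\} \;\Big|\; \mathbf{S}_{t-1} = \sigma\Big].$$

This establishes our claim.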


The following definition characterizes the schedules o which allow the system to 
survive with non-zero probability. 


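Definition 7.3.3 Let $\sigma$ be a $t$-schedule such that $\sup_\pi P_{\pi,\mathcal{A}}[\mathbf{S}_t = \sigma, T \ge t] > 0$. We then say that $\sigma$ is an $\mathcal{A}$-feasible $t$-schedule and we denote this by:

$$\sigma \in Feas_{\mathcal{A}}.$$

Remarks:

1. Using Convention 8.1.1 we see that $\sigma \in Feas_{\mathcal{A}}$ if and only if $\sup_\pi P_{\pi,\mathcal{A}}[T \ge t \mid \mathbf{S}_t = \sigma] > 0$, i.e., if $P_{\mathcal{A}}[T \ge t \mid \mathbf{S}_t = \sigma] > 0$.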


2. We will provide in Corollary 7.6.2 a purely combinatorial characterization of $Feas_{\mathcal{A}_0}$ for the adversary $\mathcal{A}_0$ defined in Definition 7.4.1, page 163.
7.4 Description of a Randomized Scheduling Algorithm


For the rest of this chapter we restrict ourselves to the case m = 1, i.e., when the 
system can sustain one fault but crashes as soon as at least two of the n processes 
are faulty. We provide a family of protocols and prove their optimality. 


7.4.1 Description of the program 


We formally defined a protocol to be a probability distribution on $P_n(p)^p$. In place of a single protocol $\pi_0$ we present here a family $Prog_0$ of programs which not only output random schedules $(S_1, \ldots, S_p)$ in $P_n(p)^p$ but also make other, internal, random draws. For the sake of clarity we distinguish between a program and the protocol associated to it, i.e., between a program and the probability distribution of the random schedule $(S_1, \ldots, S_p)$ that it generates. For every program $prog$ in $Prog_0$ we let $\pi_{prog}$ denote the associated protocol. We also let $Prot(Prog_0)$ denote the family of protocols derived from $Prog_0$:

$$Prot(Prog_0) = \{\pi_{prog} \,;\, prog \in Prog_0\}.$$


In the code describing $Prog_0$ we use statements of the kind X := uniform(a; A) and X := arbitrary(a; A). We call these statements randomized invocations. For every set $A$ and integer $a$, $a \le |A|$, the statement X := uniform(a; A) means that the set $X$ is chosen uniformly at random from $P_a(A)$, i.e., from among all subsets of $A$ of size $a$. Similarly, for every set $A$ and integer $a$, $a \le |A|$, the statement X := arbitrary(a; A) means that the set $X$ is chosen at random, but not necessarily uniformly, from $P_a(A)$. The probability distribution used for this random selection is arbitrary and depends on the values returned by the previous randomized invocations done by the program. This means that, for every $t$, the probability distribution used for the $(t+1)$-st invocation can be written $P_{r_1,\ldots,r_t}$, where $r_1, \ldots, r_t$ are the $t$ values returned by the first $t$ randomized invocations.

$Prog_0$ represents a family of programs, one for each choice of the probability distributions $P_{r_1,\ldots,r_t}$ used at all the randomized invocations. (Recall though that, by definition, if the $(t+1)$-st randomized invocation is X := uniform(a; A) then $P_{r_1,\ldots,r_t} = U_{P_a(A)}$ for all $r_1, \ldots, r_t$.) We will not make these choices explicit and will show that all programs $prog$ in $Prog_0$ are optimal.
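The two kinds of randomized invocation can be sketched as follows. This is only an illustration under stated assumptions: the callable `weight` and the explicit `history` argument are hypothetical devices standing for the history-dependent distribution $P_{r_1,\ldots,r_t}$, and enumerating all subsets is practical only for small sets.

```python
import itertools
import random


def subsets(a, A):
    """All subsets of A of size a, i.e. the elements of P_a(A)."""
    return [frozenset(c) for c in itertools.combinations(sorted(A), a)]


def uniform(a, A, rng):
    """X := uniform(a; A): every subset of size a is equally likely."""
    return rng.choice(subsets(a, A))


def arbitrary(a, A, history, weight, rng):
    """X := arbitrary(a; A): the law may depend on the past values r_1..r_t.

    `weight(history, X)` is a hypothetical user-supplied non-negative
    weight; normalising it yields the distribution P_{r_1,...,r_t}.
    """
    candidates = subsets(a, A)
    weights = [weight(history, X) for X in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Passing a constant `weight` makes `arbitrary` coincide with `uniform`, which mirrors the remark that a uniform invocation is the special case where the distribution does not depend on the history.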


We present the code describing $Prog_0$ in the next figure and provide explanations after it.

$Prog_0$

Variables:
    $C_0 \subseteq [p]$; initially $[p]$
    $C_j \subseteq [p]$, $j = 1, \ldots, p$; initially arbitrary
    $S_j \in P_n(p)$, $j = 1, \ldots, p$; initially arbitrary
    $S \subseteq [p]$; initially arbitrary
    $I, I', J \subseteq [p]$; initially arbitrary
    $K \in [p]$; initially arbitrary
    $\alpha \in \mathbb{N}$; initially $n$

Code:
01. for $t = 1, \ldots, \lfloor p/n \rfloor$ do:
02.     $S_t$ := arbitrary($n$; $C_0$)
03.     $C_0 := C_0 - S_t$
04.     $C_t := S_t$
05. for $t = \lfloor p/n \rfloor + 1, \ldots, p-n+1$ do:
06.     if $t = \lfloor p/n \rfloor + 1$ then:
07.         $S_t := C_0$
08.         $C_0 := \emptyset$
09.     else $S_t := \emptyset$
10.     if $\lfloor (p-n)/(t-1) \rfloor < \alpha$ then:
11.         $\alpha := \lfloor (p-n)/(t-1) \rfloor$
12.         $I$ := arbitrary($p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$; $[t-1]$)
13.         $J := I$; $I' := [t-1] - I$
14.         while $I \ne \emptyset$ do:
15.             $K$ := arbitrary(1; $I$)
16.             $S$ := uniform($\alpha+1$; $C_K$)
17.             $S_t := S_t \cup (C_K - S)$
18.             $C_K := S$
19.             $I := I - \{K\}$
20.         while $I' \ne \emptyset$ do:
21.             $K$ := arbitrary(1; $I'$)
22.             $S$ := uniform($\alpha$; $C_K$)
23.             $S_t := S_t \cup (C_K - S)$
24.             $C_K := S$
25.             $I' := I' - \{K\}$
26.         $C_t := S_t$
27.     if $\lfloor (p-n)/(t-1) \rfloor = \alpha$ then:
28.         $S$ := uniform($\alpha+1$; $C_{t-1}$)
29.         $S_t := C_{t-1} - S$
30.         $C_{t-1} := S$
31.         $I'$ := arbitrary($\alpha+1$; $J \cup \{t-1\}$)
32.         $J := (J \cup \{t-1\}) - I'$
33.         while $I' \ne \emptyset$ do:
34.             $K$ := arbitrary(1; $I'$)
35.             $S$ := uniform($\alpha$; $C_K$)
36.             $S_t := S_t \cup (C_K - S)$
37.             $C_K := S$
38.             $I' := I' - \{K\}$
39.         $C_t := S_t$
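The figure can be exercised with a small simulation. The sketch below is one possible reading and rests on assumptions: the tests of lines 10 and 27 are treated as mutually exclusive alternatives, both evaluated against the value of $\alpha$ at the start of the round; every `arbitrary` invocation is replaced by a uniform draw (any other distribution would be equally admissible); and the names `run_prog`, `pick` and `shrink` are hypothetical.

```python
import random


def run_prog(p, n, seed=0):
    """Simulate one execution; return (S, C) as dicts of sets after round p-n+1."""
    rng = random.Random(seed)
    C = {0: set(range(1, p + 1))}
    S, J, alpha, q = {}, set(), n, p // n

    def pick(a, A):
        # Uniform draw of a subset of size a; it also stands in for `arbitrary`.
        return set(rng.sample(sorted(A), a))

    def shrink(k, size, t):
        # Keep `size` uniformly chosen elements in C_k, move the rest to S_t.
        keep = pick(size, C[k])
        S[t] |= C[k] - keep
        C[k] = keep

    for t in range(1, q + 1):                  # lines 01-04: fresh elements only
        S[t] = pick(n, C[0])
        C[0] -= S[t]
        C[t] = set(S[t])
    for t in range(q + 1, p - n + 2):          # lines 05-39
        if t == q + 1:                         # lines 06-08: use up C_0
            S[t], C[0] = set(C[0]), set()
        else:                                  # line 09
            S[t] = set()
        f = (p - n) // (t - 1)
        if f < alpha:                          # lines 10-26
            alpha = f
            I = pick(p - n - f * (t - 1), range(1, t))
            J = set(I)
            for k in range(1, t):
                shrink(k, alpha + 1 if k in I else alpha, t)
        else:                                  # lines 27-39
            assert f == alpha
            shrink(t - 1, alpha + 1, t)
            I_prime = pick(alpha + 1, J | {t - 1})
            J = (J | {t - 1}) - I_prime
            for k in I_prime:
                shrink(k, alpha, t)
        C[t] = set(S[t])                       # lines 26 and 39
    return S, C
```

Calling `run_prog(7, 3)` and inspecting `len(S[t])` and `len(C[j])` round by round is a quick way to watch the sizes of the sets $C_j$ stay within one of each other, as the invariants of Lemma 7.4.1 assert.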


7.4.2 A presentation of the ideas underlying $Prog_0$

We begin by presenting the purpose of the program variables used in $Prog_0$. At the end of each round $t$:

1. $S_t$ is the set of $n$ elements selected by the protocol for round $t$.

2. $C_0$ represents the set of elements of $[p]$ that have not been used in the first $t$ rounds.

3. For every $j$, $1 \le j \le t$, $C_j$ is the set of elements selected at time $j$ (i.e., elements in $S_j$) which have never been used in later rounds $j+1, j+2, \ldots, t$, i.e., which are in the complement $(S_{j+1} \cup \ldots \cup S_t)^c$ of $S_{j+1} \cup \ldots \cup S_t$.

For reasons presented below, at the end of each round $t$, $t \ge \lfloor p/n \rfloor + 1$, and for every $j$, $1 \le j \le t-1$, $C_j$ is either of size $\lfloor (p-n)/(t-1) \rfloor$ or of size $\lfloor (p-n)/(t-1) \rfloor + 1$. To achieve this, at each round $t$, $t \ge \lfloor p/n \rfloor + 1$, the program variable $\alpha$ is re-initialized to this value $\lfloor (p-n)/(t-1) \rfloor$. The program variables $I$, $I'$ and $J$ are used to distinguish and manipulate the two sets $\{j \in [1, t-1] \,;\, |C_j| = \alpha\}$ and $\{j \in [1, t-1] \,;\, |C_j| = \alpha + 1\}$. The program variable $K$ is used to make some non-deterministic choices in the course of the execution.
The following explanations clarify the code and give some intuition behind the choice of these numbers.

The idea of the code is to select for each round $t$ the set $S_t$ in a greedy fashion so as to optimize the probability of surviving one more round, if faulty processors are selected uniformly at random. For each $t \le \lfloor p/n \rfloor$, the set $C_0$ of fresh elements has size at least $n$ at the beginning of round $t$ and it is therefore possible to select $S_t$ as a subset of $C_0$. This is accomplished in lines 01 through 04 of the code. As only so far unselected elements are selected at each round $t$, we have $C_j = S_j$ for every $1 \le j \le t \le \lfloor p/n \rfloor$.


In round $\lfloor p/n \rfloor + 1$ the set $C_0$ is possibly non-empty at the beginning of the round, but holds fewer than $n$ elements. We select all its elements and allocate them to $S_{\lfloor p/n \rfloor + 1}$. This is done in lines 06 through 08 of the code. From that point on, i.e., for the completion of the selection of the set $S_{\lfloor p/n \rfloor + 1}$ as well as for the selection of later sets $S_t$, we have to select elements that have been selected previously at least once. We adopt the following simple strategy: we select elements from the sets $C_j$, $1 \le j < t$, that have the biggest size until $n$ elements have been selected. For every $j$, $1 \le j < t$, by definition of $C_j$ (as the set of elements selected in round $j$ but never selected afterwards), every element of $C_j$ selected in round $t$ must be removed from $C_j$ during this same round. Hence the strategy consists in transferring into $S_t$ $n$ elements from the sets $C_j$ that have the biggest size. Once this transfer is accomplished $S_t$ is of size $n$ and we initialize $C_t$ to be equal to $S_t$.


At the point in round $\lfloor p/n \rfloor + 1$ when $C_0$ finally becomes empty (in line 08), and when the transfer strategy begins to be implemented, the sets $C_1, \ldots, C_{\lfloor p/n \rfloor}$ are all of size $n$. We can picture transferring elements away from sets $C_j$ of biggest size as the selective flow of resources away from big reservoirs: by doing so we keep the levels of all the reservoirs equal. In our case, as we are transferring discrete elements in place of a fluid, the transfer strategy keeps the sizes of the sets $C_j$ different by at most one. Another consequence of the fact that elements are transferred from the sets $C_j$ into $S_t$ is that, at the end of each round $t$, $t \ge \lfloor p/n \rfloor + 1$, the sets $C_1, \ldots, C_{t-1}, S_t$ are a partition of the set $[p]$ of all elements. Hence we are in a situation where $p$ elements are partitioned into $t$ sets: on the one hand, a set $S_t$ of $n$ elements and, on the other hand, $t-1$ sets whose sizes differ by at most one. A computation (see Lemma 7.4.3) shows that these sets must be either of size $\lfloor (p-n)/(t-1) \rfloor$ or of size $\lfloor (p-n)/(t-1) \rfloor + 1$ and that the number of sets of size $\lfloor (p-n)/(t-1) \rfloor + 1$ is $p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$. The idea in the code is to use these numbers and modify for every round $t$ the sets $C_1, \ldots, C_{t-2}, S_{t-1}$, determined at the end of round $t-1$, to produce the partition $C_1, \ldots, C_{t-1}, S_t$ that must result in round $t$.
By our previous argument, at the end of each round $t-1$, the sets $C_1, \ldots, C_{t-2}$ are all of size at least $\lfloor (p-n)/(t-2) \rfloor$ and $S_{t-1} = C_{t-1}$ is of size $n$. At the end of each round $t$, these $t-1$ sets must be modified and reduced to size $\lfloor (p-n)/(t-1) \rfloor$ or $\lfloor (p-n)/(t-1) \rfloor + 1$. We need to distinguish two cases according to whether $\lfloor (p-n)/(t-1) \rfloor$ is less than or equal to $\lfloor (p-n)/(t-2) \rfloor$. In the first case the code branches according to line 10, and lines 11 through 26 are followed. In the second case the code branches according to line 27, and lines 28 through 39 are followed.

Consider the case where $\lfloor (p-n)/(t-1) \rfloor < \lfloor (p-n)/(t-2) \rfloor$. In this case, every set $C_j$, $1 \le j \le t-1$, existing at the end of round $t-1$ is of size at least $\lfloor (p-n)/(t-2) \rfloor \ge \lfloor (p-n)/(t-1) \rfloor + 1$ and hence is big enough to be reduced to either one of the two allowable sizes ($\lfloor (p-n)/(t-1) \rfloor$ or $\lfloor (p-n)/(t-1) \rfloor + 1$) at the end of round $t$. Consider now the case where $\lfloor (p-n)/(t-1) \rfloor = \lfloor (p-n)/(t-2) \rfloor$. In this case the sets $C_j$, $1 \le j \le t-1$, which are of the smaller size $\lfloor (p-n)/(t-2) \rfloor$ at the end of round $t-1$ cannot be reduced to the size $\lfloor (p-n)/(t-1) \rfloor + 1$ ($= \lfloor (p-n)/(t-2) \rfloor + 1$) in the next round: only sets being of the bigger size $\lfloor (p-n)/(t-2) \rfloor + 1$ at the end of round $t-1$ can give rise to sets of the bigger size $\lfloor (p-n)/(t-1) \rfloor + 1$ at the end of round $t$.
In the first case, we branch according to line 10 and select in line 12 an arbitrary subset $I$ of $\{1, \ldots, t-1\}$ of size $p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$: this set is the set of indices describing which of the sets $C_1, \ldots, C_{t-1}$ will be the bigger sets at the end of round $t$. We keep in the variable $J$ a record of this selection (see line 13). Then, in lines 15 through 19 and 21 through 25 we transfer elements from the sets $C_j$ into $S_t$, leaving the sets $C_j$, $1 \le j \le t-1$, in their pre-decided size. We finally initialize $C_t$ to the value $S_t$ once the selection of $S_t$ is finished (see line 26). Let us emphasize that, at the end of the round, $J$ records the identity of the bigger sets $C_j$.


In the second case we branch according to line 27. In this case, at the beginning of round $t$, the sets $C_1, \ldots, C_{t-2}$ are all of one of the two sizes $\lfloor (p-n)/(t-1) \rfloor + 1$ and $\lfloor (p-n)/(t-1) \rfloor$, and $J$ records the identity of the bigger sets $C_j$, $1 \le j \le t-2$. Also, at this point, $C_{t-1}$ is of size $n$. As all the sets $C_1, \ldots, C_{t-1}$ must be reduced to size at most $\lfloor (p-n)/(t-1) \rfloor + 1$, we first transfer $n - (\lfloor (p-n)/(t-1) \rfloor + 1)$ elements from $C_{t-1}$ to $S_t$ (see lines 28 through 30). We finish the selection of $S_t$ by selecting one element from each of $\lfloor (p-n)/(t-1) \rfloor + 1$ sets whose indices are arbitrarily selected from $J \cup \{t-1\}$. (The selection of the $\lfloor (p-n)/(t-1) \rfloor + 1$ sets is done in line 31. The transfer of the elements is done in lines 34 through 38.) As in the previous case, we update $J$ so as to record the identity of the bigger sets $C_j$ at the end of the round (see line 32) and initialize $C_t$ with the value $S_t$ (see line 39).
We quickly discuss the choice of the probability distributions used in $Prog_0$. As mentioned at the very beginning of our explanations, the idea of the code is to select for each round $t$ the set $S_t$ in a greedy fashion so as to optimize the probability of surviving one more round, if faulty processors are selected uniformly at random. Let us call random adversary the adversary that selects the faulty processors in this fashion. The main idea of the proof will be to prove that any $\pi_0$ in $Prog_0$ is optimal against the random adversary and then to prove that, against any such $\pi_0$, no adversary can do better than the random adversary. As we will see, we could replace all the uniform randomized invocations by arbitrary randomized invocations and still obtain optimal protocols against the random adversary. The reason is (as we will see) that for every time $t$, the fact that the system is still alive after the occurrence of the $t$-th fault means exactly that, for every $j$, $j \le t-1$, the $j$-th fault is in the set $C_j$. (This is where the condition $m = 1$ plays its role.) Furthermore, if the system is still alive after the occurrence of the $t$-th fault and if the adversary is the random adversary, all the elements of $C_j$ are equally likely to be faulty. This is due to the nature of the random adversary which, by definition, makes its selections irrespective of the identity of the elements chosen by the protocol. Hence, for every $N$, if in round $t+1$ the protocol is to select a given number $N$ of elements from one of the sets $C_j$, all the $\binom{|C_j|}{N}$ choices are equally good, as any $N$ elements of $C_j$ have the same probability of all being non-faulty.


But this does not hold for a general adversary. In effect, a general adversary can 
differentiate between different elements of a given C; when the protocol uses arbi- 
trary distributions to make such selections. We show that the strategy according 
to which, whenever selecting elements from a given set C;, the protocol always uses 
the uniform distribution, disallows the adversary such capabilities of differentiation. 
This means that the use of uniform randomized invocations reduces the power of 
any adversary to the one of the random adversary. 


We use arbitrary randomized invocations to emphasize that for the other randomized invocations made by the protocol, the choice of the distribution is irrelevant for the effectiveness of the protocol. For the simplicity of the exposition we let the protocol make these choices. But we could easily extend our results and prove that the programs in $Prog_0$ would perform equally well if these choices were made by the adversary.
7.4.3 Invariants


For $t = 1, \ldots, \lfloor p/n \rfloor$ and for every execution of a program in $Prog_0$, we define round $t$ to be the part of the execution corresponding to the $t$-th loop, between line 02 and line 04 of the program. Similarly, for $t = \lfloor p/n \rfloor + 1, \ldots, p-n+1$ and for every execution of a program in $Prog_0$ we define round $t$ to be the part of the execution corresponding to the $t$-th loop, between line 06 and line 39 of the program.


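Lemma 7.4.1 $Prog_0$ satisfies the following invariants and properties valid at the end of every round $t$, $1 \le t \le p-n+1$.

a. All invocations in round $t$ of the commands arbitrary or uniform are licit, i.e., $a \le |A|$ for every invocation of arbitrary(a; A) and of uniform(a; A).

b. $\alpha$ is equal to $n$ if $t \le \lfloor p/n \rfloor$ and equal to $\lfloor (p-n)/(t-1) \rfloor$ if $\lfloor p/n \rfloor + 1 \le t \le p-n+1$.

c. $|S_t| = n$.

d. For every $j$, $1 \le j \le t$, $C_j = S_j \cap (S_{j+1} \cup \ldots \cup S_t)^c$. In particular $C_t = S_t$.

e. $C_0 = (\cup_{j=1}^{t} S_j)^c = (\cup_{j=1}^{t} C_j)^c$.

f. $C_0, C_1, \ldots, C_t$ form a partition of $[p]$.

g. If $1 \le t \le \lfloor p/n \rfloor$ then $|C_1| = \ldots = |C_t| = n$.

h. If $\lfloor p/n \rfloor + 1 \le t \le p-n+1$ then

    1. $C_0 = \emptyset$.

    2-i. For every $j$, $1 \le j \le t-1$, $|C_j|$ is equal to $\lfloor (p-n)/(t-1) \rfloor$ or $\lfloor (p-n)/(t-1) \rfloor + 1$.

    2-ii. There exists $j$, $1 \le j \le t-1$, such that $|C_j| = \lfloor (p-n)/(t-1) \rfloor$.

    3. $|C_t|$ is equal to $n$.

i. If $\lfloor p/n \rfloor + 1 \le t \le p-n+1$ then $J = \{i \in [t-1] \,;\, |C_i| = \alpha + 1\}$ and $|J| = p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$.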


PROOF. For every program variable $X$, we let $X(t)$ denote the value held by $X$ at the end of round $t$. We extend this definition to $t = 0$ and let $X(0)$ denote the initial value of any program variable $X$.

We easily see that, for every $t$, the program variable $S_t$ is changed only in round $t$. Hence $S_t(t) = S_t(t+1) = \ldots = S_t(p-n+1)$ so that we can abuse language and let $S_t$ also denote the value $S_t(t)$ held by the program variable $S_t$ at the end of round $t$. (We will make explicit when $S_t$ refers to the program variable and not to the value $S_t(t)$.)

Invariant b. For every round $t$, $1 \le t \le \lfloor p/n \rfloor$, the program variable $\alpha$ is left unchanged: $\alpha(t) = \alpha(0) = n$. For every $t$, $\lfloor p/n \rfloor + 1 \le t \le p-n+1$, the program variable $\alpha$ is left unchanged if equal to $\lfloor (p-n)/(t-1) \rfloor$ and reset to this value otherwise. (See lines 10, 11 and 27.) Hence $\alpha(t) = \lfloor (p-n)/(t-1) \rfloor$. This establishes invariant b. We prove the other invariants by induction on the round number $t$.

Case A. Consider first the case $0 \le t \le \lfloor p/n \rfloor$. In this case we furthermore establish that $S_1, \ldots, S_t$ are all disjoint, that $C_j(t) = S_j$ for every $j$, $1 \le j \le t \le \lfloor p/n \rfloor$, and that $C_0(t)$ is a set of size $p - tn$. This is true for $t = 0$ as $C_0(0) = [p]$ and as, in this case, the family $S_1, \ldots, S_t$ is empty. Assume that all the invariants are true for some round $t-1$, $0 \le t-1 < \lfloor p/n \rfloor$. Consider round $t$. In line 02 of the program, $C_0$ has value $C_0(t-1)$, which is of size $p - (t-1)n$, by induction. As $t-1 < \lfloor p/n \rfloor$, $p - (t-1)n \ge n$ and hence the invocation $S_t$ := arbitrary($n$; $C_0$) is licit: invariant a is satisfied for round $t$. Hence $S_t$ is a well-defined set of size $n$ and invariant c is satisfied for round $t$. As $S_t$ is a subset of $C_0(t-1)$, invariant e (for $t-1$) shows that $S_t$ is disjoint from $\cup_{j=1}^{t-1} S_j$ so that $S_1, \ldots, S_t$ are all disjoint. In lines 02-04 the program variables $C_j$, $1 \le j \le t-1$, are not changed. Hence $C_j(t) = C_j(t-1) = S_j$, $1 \le j \le t-1$. On the other hand, by line 04, $C_t(t) = S_t$, and by line 03, $C_0(t-1)$ is the disjoint union of $C_0(t)$ and of $S_t$. Hence $|C_0(t)| = |C_0(t-1)| - |S_t| = (p - (t-1)n) - n = p - tn$. From these properties we easily check that the invariants d, e, f and g are true for $t$.


Case B. We now turn to the case $\lfloor p/n \rfloor + 1 \le t \le p-n+1$. Assume that $t$ is such an integer and such that all the invariants are true for $t-1$.

Case B-I. Assume $\lfloor (p-n)/(t-1) \rfloor < \alpha$ in line 10 is true.

We first establish that $t = \lfloor p/n \rfloor + 1$ falls in case B-I. We just proved that all the invariants a, ..., g hold for $t-1 = \lfloor p/n \rfloor$. By Lemma 7.4.2, $\lfloor (p-n)/(t-1) \rfloor < n$. On the other hand, in line 10, the program variable $\alpha$ has value $\alpha(t-1) = \alpha(0) = n$. Hence the precondition $\lfloor (p-n)/(t-1) \rfloor < \alpha$ in line 10 is true for $t = \lfloor p/n \rfloor + 1$.

Invariant a. We easily check that for every $t$, $0 \le p-n-\lfloor (p-n)/(t-1) \rfloor (t-1) < t-1$. (For all numbers $x$ and $y$, $x - \lfloor x/y \rfloor y$ is the remainder of the Euclidean division of $x$ by $y$.) Hence the invocation $I$ := arbitrary($p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$; $[t-1]$) of line 12 is always licit (for $t = \lfloor p/n \rfloor + 1, \ldots, p-n+1$).

In line 17 and in line 23 the program variable $C_K$ has value $C_K(t-1)$, a set of size $\alpha(t-1)$ or $\alpha(t-1)+1$ by invariants b and h for $t-1$. Recall that, by assumption, the precondition of line 10 is true: $\alpha(t) < \alpha(t-1)$, i.e., $\alpha(t)+1 \le \alpha(t-1)$. This implies that the invocations $S$ := uniform($\alpha+1$; $C_K$) and $S$ := uniform($\alpha$; $C_K$) of respectively lines 16 and 22 are licit. This shows that the invariant a is true for $t$.


Invariant h-1. The variable $C_0$ is set to $\emptyset$ in round $\lfloor p/n \rfloor + 1$ and never altered afterwards so that $C_0(t) = \emptyset$ for $t = \lfloor p/n \rfloor + 1, \ldots, p-n+1$. This establishes invariant h-1.

Invariants d, e and f. By assumption (invariant f for $t-1$), $C_0(t-1), \ldots, C_{t-1}(t-1)$ is a partition of $[p]$. The set $S_t$ is obtained by first taking all the elements of $C_0(t-1)$ (which is non-empty only if $t = \lfloor p/n \rfloor + 1$), and then, in lines 16, 17, 18 and 22, 23, 24, transferring some subsets of $C_1(t-1), \ldots, C_{t-1}(t-1)$ into $S_t$. Hence, by construction, the sets $C_1(t), \ldots, C_{t-1}(t)$ and $S_t$ are a partition of $[p]$, i.e., invariant f is true. This implies also that $C_j(t) = C_j(t-1) - S_t = C_j(t-1) \cap S_t^c$ for every $j$, $1 \le j \le t-1$. By invariant d for $t-1$, we have $C_j(t-1) = S_j \cap (S_{j+1} \cup \ldots \cup S_{t-1})^c$. Hence $C_j(t) = S_j \cap (S_{j+1} \cup \ldots \cup S_t)^c$ for every $j$, $1 \le j \le t-1$. Furthermore, by line 26, $C_t(t)$ is equal to $S_t$. Hence invariant d holds for $t$. We also easily deduce e.


Invariants i and h-2. By lines 16, 18 and 22, 24, for every $j$, $1 \le j \le t-1$, the quantity $|C_j(t)|$ is equal to $\alpha(t)$ or $\alpha(t)+1$. This along with invariant b already proven shows that invariant h-2-i is true for $t$. Furthermore, the set of indices $j$, $1 \le j \le t-1$, such that $C_j(t)$ is of size $\alpha(t)+1$ is the set $I$ determined in line 12. (See lines 12 through 19 of the protocol.) This set is equal to the value allocated to $J$ in line 13. As $J$ is not further changed in round $t$ this value is $J(t)$. This establishes invariant i. As mentioned in the proof of invariant a, for every $t$, $p-n-\lfloor (p-n)/(t-1) \rfloor (t-1) < t-1$. Hence the value allocated to $I$ in line 12 is not the whole set $[t-1]$. Consequently, at the end of line 13 $I'$ is not equal to $\emptyset$ and for every $k \in I'$ the while loop from line 20 to 25 produces a set $C_k(t)$ of size $\alpha(t)$, i.e., by invariant b, a set of size $\lfloor (p-n)/(t-1) \rfloor$. This establishes invariant h-2-ii.

Invariants c and h-3. Let $b$ denote $p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$ and let $a$ denote $t-1-b$. Note that $a > 0$. We just established that the family $C_1(t), \ldots, C_{t-1}(t), S_t$ is a partition of $[p]$. Therefore, $C_1(t), \ldots, C_{t-1}(t)$ is a partition of $[p] - S_t$. By invariants h and i this partition is composed of $b$ elements of size $\alpha(t)+1$ and of $a$ elements of size $\alpha(t)$. Hence $p - |S_t| = a\,\alpha(t) + b\,(\alpha(t)+1)$. On the other hand, by Lemma 7.4.3, $p-n = a\,\alpha(t) + b\,(\alpha(t)+1)$. This shows that $|S_t| = n$ and hence, by line 26, that $|C_t(t)| = n$.
Case B-II. Assume $\lfloor (p-n)/(t-1) \rfloor < \alpha$ in line 10 is false (i.e., $\lfloor (p-n)/(t-1) \rfloor = \alpha$ in line 27 is true).

Invariant a. In line 28, the value of the program variable $C_{t-1}$ is $C_{t-1}(t-1)$ which is of size $n$ by invariants c and d for $t-1$. As $t \ge \lfloor p/n \rfloor + 1$, by Lemma 7.4.2, $\lfloor (p-n)/(t-1) \rfloor + 1 \le n$ which shows that the invocation $S$ := uniform($\alpha+1$; $C_{t-1}$) is licit.

Note that, in line 27, $\alpha$ has value $\alpha(t-1)$ so that the condition $\lfloor (p-n)/(t-1) \rfloor = \alpha$ of line 27 exactly means that $\lfloor (p-n)/(t-1) \rfloor = \alpha(t-1)$. By invariant b, $\alpha(t-1)$ is either $n$ or $\lfloor (p-n)/(t-2) \rfloor$. We prove that only the latter form arises in the equality $\lfloor (p-n)/(t-1) \rfloor = \alpha(t-1)$. Recall the two following facts established in the proof of invariant b given on page 157. 1) $\lfloor (p-n)/\lfloor p/n \rfloor \rfloor < n$. 2) $\alpha(\lfloor p/n \rfloor) = n$. From these two facts we deduce that the equality $\lfloor (p-n)/(t-1) \rfloor = \alpha(t-1)$ does not hold for $t = \lfloor p/n \rfloor + 1$ as $\lfloor (p-n)/\lfloor p/n \rfloor \rfloor < n = \alpha(\lfloor p/n \rfloor)$. Hence equality in line 27 can occur only for $t > \lfloor p/n \rfloor + 1$. By Lemma 7.4.2, $t > \lfloor p/n \rfloor + 1$ implies that $\lfloor (p-n)/(t-2) \rfloor < n$ and hence that $\alpha(t-1) = \lfloor (p-n)/(t-2) \rfloor$. To summarize: equality holds in line 27 only for $t > \lfloor p/n \rfloor + 1$ and then implies that $\lfloor (p-n)/(t-1) \rfloor = \lfloor (p-n)/(t-2) \rfloor$.

In line 31, the value of the program variable $J$ is $J(t-1)$. By induction (invariant i for $t-1$), $J(t-1)$ is a subset of $[t-2]$ which is of size $|J(t-1)| = p-n-\lfloor (p-n)/(t-2) \rfloor (t-2)$. The element $t-1$ is not in $J(t-1)$ (see invariant i for $t-1$) so that $|J(t-1) \cup \{t-1\}| = |J(t-1)| + 1$. Recall that in line 31, the program variable $\alpha$ is equal to $\alpha(t) = \lfloor (p-n)/(t-1) \rfloor$. We have:

$$\begin{aligned}
\alpha(t) + 1 &= \lfloor (p-n)/(t-1) \rfloor + 1 \\
&\le p-n-\lfloor (p-n)/(t-2) \rfloor (t-2) + 1 \quad \text{(by Lemma 7.4.4)} \\
&= |J(t-1)| + 1 \\
&= |J(t-1) \cup \{t-1\}|.
\end{aligned}$$

This shows that the invocation $I'$ := arbitrary($\alpha+1$; $J \cup \{t-1\}$) of line 31 is licit.

Line 34 is within the while-loop originated in line 33. Each of the invocations $K$ := arbitrary(1; $I'$) of line 34 is licit as occurring while $I'$ is non-empty.

In line 35, $C_K$ is of size $\alpha(t-1)+1$ because, by lines 31 and 34, $K$ is an element of $J(t-1) \cup \{t-1\}$ and because, by invariant i, $K \in J(t-1)$ implies that $|C_K(t-1)| = \alpha(t-1)+1$ which is $\alpha(t)+1$ by line 27; if $K = t-1$, lines 28 and 30 show directly that $|C_K| = \alpha+1$. This shows that the invocation $S$ := uniform($\alpha$; $C_K$) in line 35 is licit.

This completes the proof that invariant a is true for $t$.


Invariants c and h-3. In lines 28, 29, $|C_{t-1}| - (\alpha(t-1)+1) = n - (\alpha(t-1)+1)$ elements are allocated to $S_t$. Let $I'_{31}$ denote the value held by $I'$ at the end of line 31. In the while-loop (line 33 to 38), $\alpha(t-1)+1$ additional elements are allocated to $S_t$. ($\alpha(t-1)+1$ is the size of $I'_{31}$.) Hence a total of $n$ elements are allocated to $S_t$ in round $t$. This shows that $|S_t| = n$ and hence, by line 39, that $|C_t(t)| = n$.

Invariants d, e, f, and h-1. As in Case I, $C_j(t) = C_j(t-1) - S_t$ for every $j$, $1 \le j \le t-1$, and $C_1(t), \ldots, C_{t-1}(t), S_t$ is a partition of $[p]$. As in Case I, this implies that invariants d, e and f hold for $t$. The proof for invariant h-1 is also the same as in Case I.


Invariant i and h-2. By invariant h-2 for $t-1$, all the sets $C_j(t-1)$, $1 \le j \le t-2$, are of one of the two sizes $\alpha(t-1)$ and $\alpha(t-1)+1$, i.e. (recall that the condition of line 27 is true), of size $\alpha(t)$ or $\alpha(t)+1$. Also, by invariant i for $t-1$, $J(t-1)$ is the set of indices $i$, $1 \le i \le t-2$, for which $C_i(t-1)$ is of size $\alpha(t-1)+1$, i.e., of size $\alpha(t)+1$. At the end of line 30, the set of indices $i$, $1 \le i \le t-1$, for which the value of $C_i$ is of size $\alpha(t)+1$ is the set $J(t-1) \cup \{t-1\}$. Let $I'_{31}$ denote the value held by $I'$ at the end of line 31. In the while-loop of line 33 (finishing in line 38), for every index $k$ in $I'_{31}$, an element is transferred from $C_k$ to $S_t$. Hence, at the end of the while-loop all the sets $C_i$, $1 \le i \le t-1$, are of size $\alpha(t)$ or $\alpha(t)+1$. This proves invariant h-2 for $t$. Furthermore, in the while-loop of line 33, the set of indices $i$ for which the value of $C_i$ is of size $\alpha(t)+1$ is reduced to $J(t-1) \cup \{t-1\} - I'_{31}$. The value $J(t-1) \cup \{t-1\} - I'_{31}$ is the value $J(t)$ given to $J$ in line 32. ($J$ is not altered further in round $t$ and hence the value allocated to $J$ on line 32 is the value $J(t)$.)

This establishes the first part of invariant i: $J(t) = \{i \in [t-1] \,;\, |C_i(t)| = \alpha(t)+1\}$. We compute $|J(t)|$.

$$\begin{aligned}
|J(t)| &= |J(t-1)| + 1 - |I'_{31}| \\
&= |J(t-1)| + 1 - (\alpha(t)+1) \\
&= p-n-\lfloor (p-n)/(t-2) \rfloor (t-2) - \alpha(t) \\
&= p-n-\lfloor (p-n)/(t-1) \rfloor (t-2) - \alpha(t) \quad \text{(because } \alpha(t) = \lfloor (p-n)/(t-1) \rfloor = \lfloor (p-n)/(t-2) \rfloor \text{)} \\
&= p-n-\lfloor (p-n)/(t-1) \rfloor (t-1).
\end{aligned}$$

This finishes the proof of invariant i for $t$.


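Lemma 7.4.2 Let $t$ be a positive integer. Then $\lfloor (p-n)/(t-1) \rfloor < n$ if and only if $t > \lfloor p/n \rfloor$.

PROOF. Write $p = an + b$ with $0 \le b < n$. Assume that $t > \lfloor p/n \rfloor$. Then

$$\lfloor (p-n)/(t-1) \rfloor \le (p-n)/a = (an+b-n)/a = n + (b-n)/a < n.$$

We now prove that $[1 \le t-1 \le \lfloor p/n \rfloor - 1] \Rightarrow \lfloor (p-n)/(t-1) \rfloor \ge n$. If $\lfloor p/n \rfloor = 1$ the implication is trivially true. Assume that $\lfloor p/n \rfloor > 1$. We have:

$$\lfloor (p-n)/(t-1) \rfloor \ge \lfloor (p-n)/(a-1) \rfloor = \lfloor (an+b-n)/(a-1) \rfloor = \lfloor n + b/(a-1) \rfloor \ge n.$$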
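The equivalence of Lemma 7.4.2 is easy to spot-check numerically. The sketch below is only a sanity check over small values and not a substitute for the proof; the function name is hypothetical, and $t \ge 2$ is assumed so that the division by $t-1$ is defined.

```python
def lemma_7_4_2_holds(p, n, t):
    """Check floor((p-n)/(t-1)) < n  <=>  t > floor(p/n), assuming t >= 2."""
    return ((p - n) // (t - 1) < n) == (t > p // n)


# Exhaustive check over a small range of parameters with 1 <= n <= p.
all_hold = all(
    lemma_7_4_2_holds(p, n, t)
    for p in range(1, 41)
    for n in range(1, p + 1)
    for t in range(2, p + 3)
)
```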


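Lemma 7.4.3 For all integers $q$ and $t$ the system of equations

$$q = a\alpha + b(\alpha+1), \qquad a + b = t, \qquad a > 0,\ b \ge 0$$

has exactly one integer-valued system of solutions: $\alpha = \lfloor q/t \rfloor$, $b = q - \lfloor q/t \rfloor t$ and $a = t - b$.

PROOF. Uniqueness: The equation $q/t = (a/t)\alpha + (b/t)(\alpha+1)$ shows that $q/t$ is a convex combination of $\alpha$ and $\alpha+1$. Hence $\alpha$ must be equal to $\lfloor q/t \rfloor$. Also, the equation $q = (a+b)\alpha + b = t\alpha + b$ shows that $b$ must be equal to $q - t\lfloor q/t \rfloor$.

Existence: Write $q/t$ as the convex combination of $\alpha = \lfloor q/t \rfloor$ and of $\alpha+1 = \lfloor q/t \rfloor + 1$: $q/t = u\alpha + v(\alpha+1)$ with $u + v = 1$. We have $q/t = (u+v)\alpha + v = \alpha + v = \lfloor q/t \rfloor + v$. This shows that $v$ is equal to $q/t - \lfloor q/t \rfloor$ and that $u = 1 - v = 1 - q/t + \lfloor q/t \rfloor$. Hence we can choose $b = vt = q - \lfloor q/t \rfloor t$ and $a = ut = (1-v)t = t - b$.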
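The solution given by Lemma 7.4.3 amounts to one Euclidean division. A minimal sketch, assuming $q \ge 0$ and $t \ge 1$ (the function name `split_sizes` is hypothetical):

```python
def split_sizes(q, t):
    """Split q elements into t sets whose sizes differ by at most one.

    Returns (alpha, a, b): a sets of size alpha and b sets of size alpha + 1,
    with alpha = floor(q / t), b = q - alpha * t and a = t - b.
    """
    alpha = q // t
    b = q - alpha * t
    a = t - b
    return alpha, a, b


# With q = p - n and t - 1 sets, this gives the sizes used in round t.
alpha, a, b = split_sizes(7 - 3, 3)   # p = 7, n = 3, round t = 4
```

In the program, $q = p-n$ elements are spread over the $t-1$ sets $C_1, \ldots, C_{t-1}$, so `b` is the number of bigger sets recorded in $J$.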


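Lemma 7.4.4 Let $n$, $p$ and $t$ be three positive integers such that $n < p$. Assume that $\lfloor (p-n)/(t-1) \rfloor = \lfloor (p-n)/(t-2) \rfloor$. Then $\lfloor (p-n)/(t-1) \rfloor \le p-n-\lfloor (p-n)/(t-2) \rfloor (t-2)$.

PROOF. Obviously, $0 \le p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$ so that $\lfloor (p-n)/(t-1) \rfloor \le p-n-\lfloor (p-n)/(t-1) \rfloor (t-1) + \lfloor (p-n)/(t-1) \rfloor$. By assumption, $\lfloor (p-n)/(t-1) \rfloor = \lfloor (p-n)/(t-2) \rfloor$ and hence $\lfloor (p-n)/(t-1) \rfloor \le p-n-\lfloor (p-n)/(t-2) \rfloor (t-2)$.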


We now establish that, for every $t$, the quantity $\prod_{j=1}^{t} |C_j(t)|$ is deterministic, i.e., does not depend on the successive values returned by the randomized invocations arbitrary and uniform. This means in particular that, for every $t$, the quantity $\prod_{j=1}^{t} |C_j(t)|$ is the same for all $prog$ in $Prog_0$ and does not depend on the values taken by the random variables $S_j, J_1, J_2, \ldots$
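Lemma 7.4.5 For every $prog \in Prog_0$, for every $t$, $1 \le t \le p-n+1$, the product $\prod_{j=1}^{t} |C_j(t)|$ is uniquely determined as follows. (For conciseness we let $a = p-n-\lfloor (p-n)/(t-1) \rfloor (t-1)$ and $b = t-1-a = t-1-p+n+\lfloor (p-n)/(t-1) \rfloor (t-1)$.)

$$\prod_{j=1}^{t} |C_j(t)| = \begin{cases} n^t & \text{if } t \le \lfloor p/n \rfloor, \\ n \left(\lfloor (p-n)/(t-1) \rfloor + 1\right)^a \lfloor (p-n)/(t-1) \rfloor^b & \text{if } \lfloor p/n \rfloor + 1 \le t \le p-n+1. \end{cases}$$

PROOF. If $t \le \lfloor p/n \rfloor$ the result is an immediate consequence of invariant g. If $\lfloor p/n \rfloor + 1 \le t \le p-n+1$ the result is a consequence of invariants h and i.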
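Since the product depends only on $p$, $n$ and $t$, it can be evaluated directly. The sketch below assumes the reading of invariants g, h and i in which, for $t > \lfloor p/n \rfloor$, there are $a$ sets of size $\lfloor (p-n)/(t-1) \rfloor + 1$, $b$ sets of size $\lfloor (p-n)/(t-1) \rfloor$, and $C_t$ has size $n$; the function name is hypothetical.

```python
def product_of_sizes(p, n, t):
    """Product of |C_j(t)| for j = 1..t, as determined by invariants g, h, i."""
    if t <= p // n:
        return n ** t                    # invariant g: all sets have size n
    f = (p - n) // (t - 1)
    a = p - n - f * (t - 1)              # number of sets of size f + 1
    b = t - 1 - a                        # number of sets of size f
    return n * (f + 1) ** a * f ** b     # the factor n is |C_t(t)|


# Example with p = 5, n = 2: rounds t = 2, 3, 4.
products = [product_of_sizes(5, 2, t) for t in (2, 3, 4)]
```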


7.4.4 The probability space associated to $Prog_0$


By definition, a program $prog$ in $Prog_0$ has randomized invocations. The output values of $prog$ are the final values of $S_1, \ldots, S_p$. The randomized invocations are internal actions. For each $prog$ in $Prog_0$ we can construct a probability space $(\Omega', \mathcal{G}', P_{prog})$ allowing us to measure (in a probabilistic sense) not only the output (the schedules $S_1, \ldots, S_p$) but also all the randomized invocations made by $prog$.

We will need in Section 7.7 to analyze programs in $Prog_0$ and to manipulate some events from $(\Omega', \mathcal{G}', P_{prog})$. We therefore describe informally the construction of this probability space. The sample space $\Omega'$ contains the sequences of values produced by randomized invocations during executions of $Prog_0$, i.e., the realizations of the sequence (randomized-invocation$_1$, randomized-invocation$_2$, ...). The $\sigma$-field $\mathcal{G}'$ is the power set of $\Omega'$. The measure $P_{prog}$ is the measure characterized by the relations

$$P_{prog}[\,\cdot \mid \text{past randomized invocations are } r_1, \ldots, r_t] = P_{r_1,\ldots,r_t},$$

one for each sequence $r_1, \ldots, r_t$. By integration this means that, for every sequence $r_1, \ldots, r_t$, the probability $P_{prog}[(r_1, \ldots, r_t)]$ is given by $P_\varepsilon[r_1]\, P_{r_1}[r_2] \cdots P_{r_1,\ldots,r_{t-1}}[r_t]$ where we let $P_\varepsilon$ denote the probability attached to the empty sequence $\varepsilon$.
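The product formula for the probability of a sequence of randomized invocations can be illustrated by a short sketch. The callable `law` is a hypothetical stand-in for the family of distributions $P_{r_1,\ldots,r_t}$, and `example_law` is an invented two-valued example that has nothing to do with the actual program.

```python
from fractions import Fraction


def sequence_probability(seq, law):
    """P_prog[(r_1, ..., r_t)] as the product of the conditional laws.

    `law(prefix)` is a hypothetical callable returning a dict that maps each
    possible next value to its probability given the past values `prefix`.
    """
    prob = Fraction(1)
    for i, r in enumerate(seq):
        prob *= law(tuple(seq[:i])).get(r, Fraction(0))
    return prob


def example_law(prefix):
    # First draw: fair coin. Later draws repeat the previous value w.p. 3/4.
    if not prefix:
        return {"a": Fraction(1, 2), "b": Fraction(1, 2)}
    last = prefix[-1]
    other = "b" if last == "a" else "a"
    return {last: Fraction(3, 4), other: Fraction(1, 4)}
```

Exact fractions are used so that the probabilities of all sequences of a given length can be checked to sum to one without rounding error.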


In contrast, recall that in Section 7.2 we constructed the sample space associated to
protocols $\pi$ to contain the sequences of values taken by random schedules, i.e., the
realizations of the sequence $(S_1, S_2, \ldots)$. As each set $S_t$ output by prog can be
constructed from the sequence

(randomized-invocation$_1$, randomized-invocation$_2$, ...),

we see that the $\sigma$-field $\mathcal{G}'$ is bigger than the one considered for protocols. For every
prog in $Prog_0$, the measure $P_{prog}$ extends the measure $\pi_{prog}$ defined on the set of
schedules produced by prog onto this bigger space $(\Omega', \mathcal{G}')$.


Recall that, for every t, every t-schedule $\sigma$ and every t-fault sequence $\phi$ adapted to $\sigma$,
the conditional probability $\pi[\,\cdot \mid \mathcal{S}_t = \sigma]$ is well defined whereas $\pi[\,\cdot \mid \mathcal{S}_t = \sigma, \mathcal{F}_t = \phi]$
is not. This is because the event $\{\mathcal{F}_t = \phi\}$ is not expressible in terms of the schedule
$S_1, S_2, \ldots$ and hence is not in the $\sigma$-field over which the probability distribution $\pi$
is defined. In Lemma 7.3.3 we formally made sense of this conditional probability
for events in Gj, i.e., for events describable in terms of the set 5,,, and in terms of 
the decisions $\sigma$ and $\phi$ taken up to round t by both the protocol and the adversary.
We let $P_\pi[\,\cdot \mid \mathcal{S}_t = \sigma, \mathcal{F}_t = \phi]$ denote this extension.


We can make a similar extension with $P_{prog}$ so as to compute the probability of
events depending on the randomized invocations done by prog up to round $t+1$
(the current round) and on past decisions (i.e., up to round t), conditioned on the
past decisions taken by both the protocol and the adversary. To simplify, we also use
$P_{prog}$ to denote this extension. For instance, in the proof of Lemma 7.7.4 we
will consider the expression

$$P_{prog}[\,T \ge t+1 \mid \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi,\ T \ge t,\ J_{t+1} = j\,].$$


7.4.5 The optimality property 


The optimization problem $\sup_\pi \inf_A E_{\pi,A}[T]$ is called the primal problem. We let $t_{opt}$
denote its value. Similarly the optimization problem $\inf_A \sup_\pi E_{\pi,A}[T]$ is called the
dual problem and we let $t'_{opt}$ denote its value. A protocol $\pi_{opt}$ solves the primal
problem if $t_{opt} = t(\pi_{opt})$.⁹ The existence of such a protocol implies in particular that the
sup is attained in the primal problem and that $\max_\pi \inf_A E_{\pi,A}[T] = \inf_A E_{\pi_{opt},A}[T]$.
An adversary $A_{opt}$ solves the dual problem if $t'_{opt} = t'(A_{opt}) = \sup_\pi E_{\pi,A_{opt}}[T]$. The
existence of such an adversary implies in particular that the inf is attained in the
dual problem and that $\min_A \sup_\pi E_{\pi,A}[T] = \sup_\pi E_{\pi,A_{opt}}[T]$.


The following adversary $A_0$ plays a fundamental role in the understanding and the
analysis of protocols in $Prot(Prog_0)$.


Definition 7.4.1 We let $A_0$ denote the adversary that selects at each round t an
element chosen uniformly at random from $s_t - \{f_1, \ldots, f_{t-1}\}$ if this set is not empty,
and selects an arbitrary element from $s_t$ otherwise.


This formally means that $A_0 = (Q_v)_{v \in V}$ where V is the family of t-odd-executions
and where, for every $v = (s_1, f_1, \ldots, s_t)$, $Q_v$ is equal to $U_{s_t - \{f_1, \ldots, f_{t-1}\}}$ when
$s_t - \{f_1, \ldots, f_{t-1}\} \ne \emptyset$, and $Q_v$ is arbitrary when $s_t - \{f_1, \ldots, f_{t-1}\} = \emptyset$.
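To make Definition 7.4.1 concrete, one round of the random adversary can be sketched as follows. This is an illustration under our own assumptions: the function name `a0_select_fault` is hypothetical, processors are represented as integers, and the "arbitrary" choice in the degenerate case is resolved here by a uniform draw from $s_t$, which is only one admissible option.

```python
import random
from typing import Sequence, Set


def a0_select_fault(s_t: Set[int], past_faults: Sequence[int],
                    rng: random.Random) -> int:
    """One move of the random adversary A_0 (illustrative sketch)."""
    # Candidates are the elements of s_t that are not already faulty.
    candidates = sorted(s_t - set(past_faults))
    if not candidates:
        # Degenerate case: the definition allows any element of s_t.
        # Assumption of this sketch: pick uniformly in s_t.
        candidates = sorted(s_t)
    return rng.choice(candidates)


rng = random.Random(0)
fault = a0_select_fault({1, 2, 3}, past_faults=[1, 2], rng=rng)  # only 3 is left
```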
Our main result is described in the next theorem. 


⁹ We presented on page 143 the definition of $t(\pi)$ for a protocol $\pi$: $t(\pi) = \inf_A E_{\pi,A}[T]$. For
an adversary A, $t'(A)$ is defined symmetrically by $t'(A) = \sup_\pi E_{\pi,A}[T]$.

Theorem 7.4.6 Every protocol $\pi_0$ in $Prot(Prog_0)$ solves the primal problem while
the adversary $A_0$ solves the dual problem. These two problems have the same value,
equal to $E_{\pi_0,A_0}[T]$.


The two following lemmas are the crucial ingredients of the proof of Theorem 7.4.6.
We defer their proofs until after that of Theorem 7.4.6. The first lemma expresses that
the protocols in $Prot(Prog_0)$ are optimal against adversary $A_0$.


Lemma 7.4.7 Let $\pi_0$ be a protocol in $Prot(Prog_0)$. Then $\max_\pi E_{\pi,A_0}[T] = E_{\pi_0,A_0}[T]$.


The next lemma expresses that, when a protocol $\pi_0$ in $Prot(Prog_0)$ is used, the
expected time of survival of the system is independent of the adversary A used
in conjunction with $\pi_0$. This implies in particular that $A_0$ is optimal against the
protocol $\pi_0$.


Lemma 7.4.8 Let $\pi_0$ be a protocol in $Prot(Prog_0)$. Then $E_{\pi_0,A}[T]$ is independent
of A.


We are now ready to prove our main result.
PROOF of Theorem 7.4.6:


Let $\pi_0$ be a protocol in $Prot(Prog_0)$. Then

$$\sup_\pi E_{\pi,A_0}[T] = E_{\pi_0,A_0}[T] \quad \text{(by Lemma 7.4.7)}$$
$$\phantom{\sup_\pi E_{\pi,A_0}[T]} = \inf_A E_{\pi_0,A}[T] \quad \text{(by Lemma 7.4.8)}.$$

By Lemma 8.2.2, $\sup_\pi \inf_A E_{\pi,A}[T] = \inf_A E_{\pi_0,A}[T]$, i.e., $\pi_0$ solves the primal
problem, and similarly $A_0$ solves the dual problem. Furthermore these two problems
have the same value, equal to $E_{\pi_0,A_0}[T]$.
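The appeal to Lemma 8.2.2, which is stated elsewhere and not reproduced in this section, can be unpacked with the elementary inequality $\sup\inf \le \inf\sup$, valid for any payoff function. The chain below is our sketch of that argument and is not a quotation of Lemma 8.2.2:

```latex
\begin{align*}
\inf_{A} E_{\pi_0,A}[T]
  &\le \sup_{\pi}\inf_{A} E_{\pi,A}[T]  && \text{(take $\pi = \pi_0$)} \\
  &\le \inf_{A}\sup_{\pi} E_{\pi,A}[T]  && \text{(weak duality)} \\
  &\le \sup_{\pi} E_{\pi,A_0}[T]        && \text{(take $A = A_0$)} \\
  &=   E_{\pi_0,A_0}[T]                 && \text{(Lemma 7.4.7)} \\
  &=   \inf_{A} E_{\pi_0,A}[T]          && \text{(Lemma 7.4.8).}
\end{align*}
```

Since the first and last terms coincide, every inequality in the chain is an equality, which gives both optimality statements and the common value at once.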


Note: As discussed in Section 7.1.2, the equality of the values of the primal and
the dual problem is a direct consequence of Von-Neumann's theorem.

7.5 The Random Variables $C_j(t)$ and $L_i(t)$


In Section 7.4, for every program prog in $Prog_0$, for every t and j, $1 \le j \le t \le p$, we
defined $C_j(t)$ to be the value held by the program variable $C_j$ at the end of round t
of prog. In this section we define the values $C_j(t)$ to be random variables expressed
in terms of the schedule $\mathcal{S}_t$. This definition is valid for arbitrary protocols and is
compatible with the definition given in Section 7.4 in the special case of protocols
in $Prot(Prog_0)$. (See invariant d of Lemma 7.4.1.)


For each j and t, $1 \le j \le t$, $C_j(t)$ is the set of elements selected at time j and which
are not used in the later rounds $j+1, j+2, \ldots, t$. $C_0(t)$ is the set of elements of [p]
that are not used in the first t rounds. We also introduce the random variables $L_i(t)$
counting the number of sets $C_j(t)$, $1 \le j \le t-1$, which are of size i.


The relevance of these random variables is indicated by Lemma 7.4.5,
which expresses that every prog in $Prog_0$ is such that for every t, $1 \le t \le \lfloor p/n \rfloor$,
$L_n(t) = t-1$ and $L_i(t) = 0$ if $i \ne n$; and such that for every t, $t \ge \lfloor p/n \rfloor + 1$,
$L_i(t) = p-n-\lfloor\frac{p-n}{t-1}\rfloor(t-1)$ if $i = \lfloor\frac{p-n}{t-1}\rfloor + 1$, $L_i(t) = t-1-p+n+\lfloor\frac{p-n}{t-1}\rfloor(t-1)$
if $i = \lfloor\frac{p-n}{t-1}\rfloor$, and $L_i(t) = 0$ for any other value of i. We will see that these properties
do in fact characterize optimal schedules. The following definition formalizes these
notions. For a set $\Phi$ in $\{1, \ldots, p\}$, we use the notation $\Phi^c$ to denote the complement
of $\Phi$ in $\{1, \ldots, p\}$.


Definition 7.5.1 For all integers j and t, $1 \le j \le t \le p$, and every k, $0 \le k \le n$,
we define the following random variables:

$$C_1(t) = S_1 \cap (S_2 \cup \ldots \cup S_t)^c$$
$$C_2(t) = S_2 \cap (S_3 \cup \ldots \cup S_t)^c$$
$$\vdots$$
$$C_j(t) = S_j \cap (S_{j+1} \cup \ldots \cup S_t)^c$$
$$\vdots$$
$$C_t(t) = S_t.$$

We extend the definition of $C_j(t)$ to $j = 0$ and set

$$C_0(t) = (S_1 \cup \ldots \cup S_t)^c.$$

We say that $(C_0(t), \ldots, C_t(t))$ is the c-sequence derived from the schedule $(S_1, \ldots, S_t)$.
For $1 \le t \le p$ and $0 \le i \le n$, we let $L_i(t)$ denote the number of sets $C_j(t)$, $1 \le j \le t-1$,
which are of size i:

$$L_i(t) = \left|\{C_j(t);\ 1 \le j \le t-1,\ |C_j(t)| = i\}\right|.$$

We extend the definition of $L_i(t)$ to $i = n+1$ and set

$$L_{n+1}(t) \stackrel{\text{def}}{=} |C_0(t)|.$$

As usual, we use lower case letters and write for instance $c_0(t), c_1(t), \ldots, c_t(t)$ to
denote the realizations of the random variables $C_0(t), C_1(t), \ldots, C_t(t)$ and $l_0(t), \ldots, l_{n+1}(t)$
to denote the realizations of the random variables $L_0(t), \ldots, L_{n+1}(t)$.
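The definition above translates directly into a short computation. The following sketch is an illustration only; the function names `c_sequence` and `l_sequence` are ours, processors are numbered 1 to p, and a schedule is represented as a list of Python sets.

```python
from typing import List, Set


def c_sequence(schedule: List[Set[int]], p: int) -> List[Set[int]]:
    """Return [c_0(t), c_1(t), ..., c_t(t)] for the t-schedule `schedule`."""
    t = len(schedule)
    used: Set[int] = set()
    cs: List[Set[int]] = [set() for _ in range(t + 1)]
    # Walk backwards: when s_j is processed, `used` is s_{j+1} U ... U s_t.
    for j in range(t, 0, -1):
        cs[j] = set(schedule[j - 1]) - used
        used |= schedule[j - 1]
    # c_0(t) collects the processors never selected in rounds 1..t.
    cs[0] = set(range(1, p + 1)) - used
    return cs


def l_sequence(schedule: List[Set[int]], p: int, n: int) -> List[int]:
    """Return [l_0(t), ..., l_{n+1}(t)]; l_{n+1}(t) is |c_0(t)| by convention."""
    cs = c_sequence(schedule, p)
    t = len(schedule)
    ls = [0] * (n + 2)
    for j in range(1, t):          # only the sets c_1(t), ..., c_{t-1}(t)
        ls[len(cs[j])] += 1
    ls[n + 1] = len(cs[0])
    return ls


# Example: p = 5 processors, sets of size n = 2, three rounds.
example = [{1, 2}, {2, 3}, {4, 5}]
cs = c_sequence(example, p=5)      # c_0 = {}, c_1 = {1}, c_2 = {2, 3}, c_3 = {4, 5}
ls = l_sequence(example, p=5, n=2)
```

In this example processor 2 is reused in round 2, so it leaves $c_1(3)$, and the sizes of $c_0(3), \ldots, c_3(3)$ add up to p.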


The next properties are simple but fundamental. Property 3 means that the system
is still alive by time t if and only if for every j, $j \le t-1$, the j-th fault is in the set
$C_j(t)$. Let us emphasize that this property would not hold for $m \ge 2$.


Lemma 7.5.1 Let t, $1 \le t \le p$, be arbitrary. Then:

1. The family $(C_j(t))_{0 \le j \le t}$ forms a partition of [p].

2. $\sum_{i=0}^{n} L_i(t) = t-1$.

3. $\{T \ge t\} = \bigcap_{j=1}^{t-1} \{F_j \in C_j(t)\}$ $\left(= \bigcap_{u=2}^{t} \{S_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\}\right)$.


PROOF. We easily check that the family $(C_j(t))_{0 \le j \le t}$ is a partition of [p]. The
condition $\sum_{i=0}^{n} L_i(t) = t-1$ just expresses that the $t-1$ sets $C_1(t), \ldots, C_{t-1}(t)$
are different and of cardinality between 0 and n. As expressed by the formula
$\{T \ge t\} = \bigcap_{u=2}^{t} \{S_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\}$ that we recall for completeness, the
survival time T is at least t if and only if, for every time u, $2 \le u \le t$, $S_u$ does
not contain any of $F_1, \ldots, F_{u-1}$. (Note that this fact uses the hypothesis $m = 1$.)
Equivalently, $T \ge t$ if and only if, for every j, $1 \le j \le t-1$, $F_j$ is not contained in any
of $S_{j+1}, \ldots, S_t$, i.e., if $F_j \in (S_{j+1} \cup \ldots \cup S_t)^c$. On the other hand, by definition, $F_j \in S_j$
for every j. Therefore $T \ge t$ if and only if $F_j \in C_j(t)$ for every j, $1 \le j \le t-1$.
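Property 3 can be checked exhaustively on small instances. The sketch below enumerates every fault sequence adapted to a given schedule and compares the two descriptions of the event $\{T \ge t\}$. The helper names `alive`, `c_set` and `check_property_3` are ours and hypothetical; the check is a sanity test, not a proof.

```python
from itertools import product
from typing import List, Sequence, Set


def alive(schedule: List[Set[int]], faults: Sequence[int], t: int) -> bool:
    """{T >= t}: for every u in 2..t, s_u contains none of f_1, ..., f_{u-1}."""
    return all(not (schedule[u - 1] & set(faults[:u - 1]))
               for u in range(2, t + 1))


def c_set(schedule: List[Set[int]], j: int, t: int) -> Set[int]:
    """c_j(t) = s_j minus (s_{j+1} U ... U s_t), for 1 <= j <= t."""
    later: Set[int] = set().union(*schedule[j:t])
    return schedule[j - 1] - later


def check_property_3(schedule: List[Set[int]]) -> bool:
    """Compare both sides of Property 3 on every adapted fault sequence."""
    t = len(schedule)
    # A fault sequence (f_1, ..., f_{t-1}) is adapted when f_j lies in s_j.
    for faults in product(*[sorted(s) for s in schedule[:t - 1]]):
        in_c = all(faults[j - 1] in c_set(schedule, j, t) for j in range(1, t))
        if alive(schedule, faults, t) != in_c:
            return False
    return True


ok = check_property_3([{1, 2}, {2, 3}, {3, 4}, {1, 5}])
```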


Definition (7.5.1) establishes how a c-sequence $(c_j(t))_j$ can be derived from a given
schedule $(s_1, \ldots, s_t)$. The following lemma conversely characterizes the sequences
$(\gamma_j)_j$ that can be realized as a c-sequence from some schedule $(s_1, \ldots, s_t)$. This
result will be used in Lemma 7.6.11 to characterize the optimal schedules.

Lemma 7.5.2 Let $\gamma_1, \ldots, \gamma_{t-1}$ be integers in the interval [0, n]. Then the condition

$$n \le p - (\gamma_1 + \ldots + \gamma_{t-1})$$

is a necessary and sufficient condition for the existence of a schedule $(s_1, \ldots, s_t)$
satisfying $|c_j(t)| = \gamma_j$ for all j, $1 \le j \le t-1$.


PROOF. We first establish the necessity of the condition. By Lemma 7.5.1, for all
schedules $(s_1, \ldots, s_t)$, the family $(c_j(t))_{0 \le j \le t}$ forms a partition of [p]. This clearly
implies that $\sum_{j=0}^{t} |c_j(t)| = p$ and hence that $\sum_{j=1}^{t} |c_j(t)| \le p$. Since $|c_t(t)| = |s_t| = n$,
this is the stated condition when $|c_j(t)| = \gamma_j$ for $1 \le j \le t-1$.

Conversely, we prove by induction on t that, for every sequence $\gamma_1, \ldots, \gamma_{t-1}$ of
integers in [0, n] such that $n \le p - (\gamma_1 + \ldots + \gamma_{t-1})$, there exists a schedule
$(s_1, \ldots, s_t)$ whose associated c-sequence satisfies $|c_j(t)| = \gamma_j$, $1 \le j \le t-1$, and
$|c_0(t)| = p - (\gamma_1 + \ldots + \gamma_{t-1}) - n$.

• The property is trivially true for $t = 1$: in this case the family of conditions
$|c_j(t)| = \gamma_j$, $1 \le j \le t-1$, is empty, and the set $c_0(1)$ of processors not used has size
$p - |s_1| = p - n$.

• Assume the property verified for t, and consider a sequence $\gamma_1, \ldots, \gamma_t$ of integers
in [0, n] such that $n \le p - (\gamma_1 + \ldots + \gamma_t)$. This condition trivially implies that
$n \le p - (\gamma_1 + \ldots + \gamma_{t-1})$. Therefore, the result at the induction level t being true,
there exists a schedule $(s_1, \ldots, s_t)$ for which $|c_j(t)| = \gamma_j$, $1 \le j \le t-1$, and such that
the number of processors still unused at time t is

$$|c_0(t)| = p - (\gamma_1 + \ldots + \gamma_{t-1}) - n.$$

We then construct a set $s_{t+1}$ of n processors by taking any $n - \gamma_t$ elements from the
set $s_t$ and any $\gamma_t$ elements from the set $c_0(t)$ of unused processors. This construction
is possible exactly when $\gamma_t \le |c_0(t)|$, i.e., when

$$\gamma_t \le p - (\gamma_1 + \ldots + \gamma_{t-1}) - n,$$

which is true: this is the condition assumed on $\gamma_1, \ldots, \gamma_t$. Note that, by construction, the sets
$c_j(t)$, $1 \le j \le t-1$, are unaffected by the selection of the set $s_{t+1}$ and hence:

$$|c_j(t+1)| = |c_j(t)| = \gamma_j, \quad 1 \le j \le t-1.$$

On the other hand,

$$|c_t(t+1)| = |s_t| - (n - \gamma_t) = \gamma_t.$$

To finish the induction from t to $t+1$, we note that

$$|c_0(t+1)| = |c_0(t)| - \gamma_t = p - (\gamma_1 + \ldots + \gamma_{t-1} + \gamma_t) - n.$$
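The inductive construction in this proof is effective and can be sketched as code. The function below is our illustration: the names `build_schedule` and `c_sizes` are hypothetical, processors are numbered 1 to p, and where the proof takes "any" elements the sketch fixes the first ones in a list order.

```python
from typing import List, Sequence, Set


def build_schedule(gammas: Sequence[int], p: int, n: int) -> List[Set[int]]:
    """Return (s_1, ..., s_t), t = len(gammas) + 1, with |c_j(t)| = gammas[j-1]."""
    if any(not 0 <= g <= n for g in gammas) or n > p - sum(gammas):
        raise ValueError("condition n <= p - (gamma_1 + ... + gamma_{t-1}) fails")
    current = list(range(1, n + 1))        # s_1
    unused = list(range(n + 1, p + 1))     # c_0(1), of size p - n
    schedule = [set(current)]
    for g in gammas:
        kept = current[: n - g]            # n - gamma_t elements of s_t
        fresh, unused = unused[:g], unused[g:]   # gamma_t unused elements
        current = kept + fresh             # s_{t+1}
        schedule.append(set(current))
    return schedule


def c_sizes(schedule: List[Set[int]]) -> List[int]:
    """Return [|c_1(t)|, ..., |c_{t-1}(t)|] for the t-schedule `schedule`."""
    t = len(schedule)
    return [len(schedule[j - 1] - set().union(*schedule[j:t]))
            for j in range(1, t)]


sched = build_schedule([0, 2, 1], p=6, n=3)
sizes = c_sizes(sched)   # recovers the requested sizes 0, 2, 1
```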


The schedules in normal form that we now define will be shown in the sequel to be
the schedules maximizing the expected survival time. In essence, a t-schedule $\sigma$ is in
normal form if the sizes of the sets $c_j(t)$, $1 \le j \le t-1$ (of the c-sequence associated
to $\sigma$), differ by at most one, and if $c_0(t)$ is not empty only if every $c_j(t)$, $1 \le j \le t-1$,
is of size n. Formally:


Definition 7.5.2 Let $t \ge 1$ be a time. We say that a t-schedule $\sigma$ is in normal
form and write

$$\sigma \in N_t$$

if there exists $\alpha_t$, $0 \le \alpha_t \le n$, such that

$$\forall j,\ 1 \le j \le t-1, \quad |c_j(t)| = \alpha_t \ \text{ or } \ |c_j(t)| = \alpha_t + 1,$$
$$|c_0(t)| > 0 \ \Rightarrow\ \alpha_t = n.$$

We then also say that $(c_0(t), \ldots, c_{t-1}(t))$ is in normal form.


Invariants g and h of Lemma 7.4.1 express that the programs in $Prog_0$ all produce
schedules in normal form. We next show that a t-schedule $\sigma$ is in normal form if and
only if the associated sequence $(l_0(t), \ldots, l_{n+1}(t))$ has at most two non-zero terms,
which are consecutive.


Lemma 7.5.3 A t-schedule $\sigma$ is in normal form if and only if there exists $\alpha_t$,
$0 \le \alpha_t \le n+1$, such that the associated sequence $(l_0(t), \ldots, l_{n+1}(t))$ satisfies the equality

$$(l_0(t), \ldots, l_{n+1}(t)) = (0, \ldots, 0, l_{\alpha_t}(t), l_{\alpha_t+1}(t), 0, \ldots, 0).$$


PROOF. Simple consequence of Definitions 7.5.1 and 7.5.2.


The next lemma shows that, for every t, t-schedules in normal form have a unique
associated sequence $(l_0(t), \ldots, l_{n+1}(t))$ and that, conversely, this value characterizes
t-schedules in normal form. We will use this property in Lemma 7.6.11 to show that
a property is specific to schedules in normal form.

Lemma 7.5.4 1. Let $1 \le t \le \lfloor p/n \rfloor$. Then a t-schedule $\sigma$ is in $N_t$ if and only
if the only non-zero terms of the sequence $(l_0(t), \ldots, l_{n+1}(t))$ are $l_n(t) = t-1$ and
$l_{n+1}(t) = p - tn$. (And hence, $\alpha_t = n$.)


2. Let $t > \lfloor p/n \rfloor$. Then a t-schedule $\sigma$ is in $N_t$ if and only if the only non-zero
terms of the sequence $(l_0(t), \ldots, l_{n+1}(t))$ are

$$l_{\alpha_t}(t) = \lfloor\tfrac{p-n}{t-1}\rfloor (t-1) + t + n - p - 1$$

and, if $p-n$ is not a multiple of $t-1$,

$$l_{\alpha_t+1}(t) = p - n - \lfloor\tfrac{p-n}{t-1}\rfloor (t-1),$$

where

$$\alpha_t = \lfloor\tfrac{p-n}{t-1}\rfloor.$$


PROOF. • If $t \le \lfloor p/n \rfloor$ then $tn \le p$. Working by contradiction, assume that
there exists $i < n$ such that $l_i(t) > 0$, i.e., that there exists a set $c_{j_0}(t)$ such that
$|c_{j_0}(t)| < n$. Then

$$\Big|\bigcup_{j=1}^{t} c_j(t)\Big| \le \sum_{j=1}^{t} |c_j(t)| = \sum_{j \ne j_0} |c_j(t)| + |c_{j_0}(t)| < tn \le p.$$

As the family $(c_j(t))$, $0 \le j \le t$, is a partition of [p], $c_0(t)$ must be non-empty which,
by the normality property, implies that all the sets $c_j(t)$, $1 \le j \le t-1$, must have
size n. Hence

$$l_n(t) = \left|\{c_j(t);\ 1 \le j \le t-1,\ |c_j(t)| = n\}\right| = t-1.$$

This in turn implies that

$$l_{n+1}(t) = |c_0(t)| = p - tn.$$


• Consider now the case $t > \lfloor p/n \rfloor$. Working by contradiction, assume that $l_{n+1}(t) > 0$.
Then, by the normality condition, all the sets $c_j(t)$, $1 \le j \le t-1$, must have size n.
We have:

$$\Big|\bigcup_{j=1}^{t} c_j(t)\Big| = \sum_{j=1}^{t} |c_j(t)| \quad \text{(since the sets $c_j(t)$ are disjoint)}$$
$$= tn \ \ge\ (\lfloor p/n \rfloor + 1)\,n \ >\ p,$$

a contradiction. Hence $l_{n+1}(t) = 0$, i.e., $c_0(t) = \emptyset$. Hence the $t-1$ sets $c_1(t), \ldots, c_{t-1}(t)$
must divide $p - |s_t| = p - n$ elements among themselves. Lemma 7.4.3 (where we
replace q by $p-n$ and t by $t-1$) shows that $\alpha_t = \lfloor\frac{p-n}{t-1}\rfloor$, $l_{\alpha_t}(t) = \lfloor\frac{p-n}{t-1}\rfloor(t-1)+t+n-p-1$
and $l_{\alpha_t+1}(t) = p-n-\lfloor\frac{p-n}{t-1}\rfloor(t-1)$.
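The two cases of Lemma 7.5.4 determine the whole sequence $(l_0(t), \ldots, l_{n+1}(t))$ of a schedule in normal form from p, n and t alone. The following sketch computes it; the function name `normal_form_l_sequence` is ours and hypothetical, and the code assumes $1 \le n \le p$ and $t \ge 1$.

```python
from typing import List


def normal_form_l_sequence(p: int, n: int, t: int) -> List[int]:
    """Return [l_0(t), ..., l_{n+1}(t)] for a t-schedule in normal form."""
    ls = [0] * (n + 2)
    if t <= p // n:
        # Case 1: the t-1 earlier sets keep size n; p - t*n processors unused.
        ls[n] = t - 1
        ls[n + 1] = p - t * n
    else:
        # Case 2: the p - n processors outside s_t are split as evenly as
        # possible among the t-1 sets c_1(t), ..., c_{t-1}(t).
        alpha = (p - n) // (t - 1)
        ls[alpha + 1] = (p - n) - alpha * (t - 1)
        ls[alpha] = (t - 1) - ls[alpha + 1]
    return ls


# Example with p = 7 and n = 2 (so floor(p/n) = 3).
early = normal_form_l_sequence(7, 2, 2)   # case 1
late = normal_form_l_sequence(7, 2, 4)    # case 2
```

In the second case the entry `ls[alpha]` equals $\lfloor\frac{p-n}{t-1}\rfloor(t-1)+t+n-p-1$, which is the expression of the lemma written as $(t-1) - l_{\alpha_t+1}(t)$.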
7.6 $\pi_0$ is optimal against $A_0$


This section is devoted to the proof of Lemma 7.4.7.


7.6.1 Sketch of the proof and intuitions 


For a given adversary A, an optimal protocol is one that, at each time $t+1$, exploits
to its fullest all the information available so as to optimize the choice of $S_{t+1}$.
Therefore to construct an optimal protocol, we first analyze the notion of “information
available to the protocol”. We distinguish between the explicit information, for- 
mally described in the model, and the implicit information that an optimal protocol 
is able to deduce. Consider for instance the case of the identity of the adversary 
A. In Section 7.2, when modeling protocols and adversaries, we did not provide a 
mechanism allowing a protocol to be informed of the identity of the adversary that 
it plays against. This means that the protocol does not know explicitly the identity 
of the adversary. Nevertheless, for a given adversary A, there is one protocol that 
always assumes that the adversary is A and takes optimal decisions based on this 
assumption. This protocol is by construction optimal if the adversary is A. We 
then say that the optimal protocol knows implicitly the identity of the adversary. 


In Section 7.2 we modeled a protocol to be an entity deciding the whole schedule
$(S_1, \ldots, S_p)$ ahead of time, i.e., in an off-line fashion. $S_1$ is the first selected set,
and for every t, $t \ge 1$, $S_{t+1}$ is the set selected to be used after the occurrence of the
t-th fault. In this model the adversary “sees” the sets $S_t$ only when they come into
operation.


Alternatively, we could have modeled a protocol to be an entity interacting in an 
on-line fashion with the adversary: in this model, at each occurrence of a fault, the 
adversary informs the protocol of the occurrence of a fault and whether the system 
is still alive at this point. If the system is still alive the protocol then selects the set
$S_{t+1}$ to be used next and communicates its choice to the adversary.


It might seem that in our model (the off-line model) the protocol is weaker than
in the on-line model, as, for every t, it has to select the set $S_{t+1}$ without knowing
whether the system is alive after the t-th fault. Nevertheless we easily see that, in
the off-line model, the protocol can assume without loss of generality that $T \ge t$,
i.e., that the system is alive after the t-th fault, while selecting the set $S_{t+1}$ for time
$t+1$. Indeed, if this happens not to be the case and the system dies before time
$t+1$, all the decisions and assumptions made by the protocol for time $t+1$ are
irrelevant.

This discussion shows that the off-line and the on-line model of a protocol are
equivalent so that we can adopt the on-line model in this informal presentation.
From the on-line description, it is clear that, for every t, the information about the
past execution available to the protocol upon selecting the set $S_{t+1}$ consists of its
own past decisions, i.e., of $\mathcal{S}_t = \sigma$, and of the fact that the system is still alive at
this point, i.e., of $T \ge t$. Note that the information $T \ge t$ is given explicitly to the
protocol in the on-line model and only given implicitly in the off-line model that we
chose in Section 7.2.


For every t, upon selecting a new set $S_{t+1}$, an optimal protocol can use all the
information available and guess first what are the locations $F_1, \ldots, F_t$ of the faults
already committed by the adversary. To “guess” the protocol uses the probability
distribution

$$P_A[(F_1, \ldots, F_t) = \cdot \mid T \ge t,\ \mathcal{S}_t = \sigma],$$

i.e., the probability of the value allocated to $(F_1, \ldots, F_t)$ by the adversary A,
conditioned on the knowledge of the past execution held by the protocol.


By Lemma 7.5.1, the system is alive at time t, i.e., $T \ge t$, exactly if $F_j$ is in $C_j(t)$ for
every j, $1 \le j \le t-1$. (This is not true for $m \ge 2$.) Hence the previous probability
can be rewritten $P_A[(F_1, \ldots, F_t) = \cdot \mid \bigcap_{j=1}^{t-1} \{F_j \in c_j(t)\},\ \mathcal{S}_t = \sigma]$, which shows
the relevance to the analysis of the sets $C_j(t)$. In this section the adversary is the
random adversary $A_0$ defined in Definition 7.4.1. In this case, by definition, at each
time t, a fault occurs uniformly at random in $S_t$ and we can then further establish
in Section 7.6.2 that $P_{A_0}[(F_1, \ldots, F_t) = \cdot \mid T \ge t,\ \mathcal{S}_t = \sigma] = \bigotimes_{j=1}^{t} U_{c_j(t)}$.


This result fully elucidates the notion of “information available to the protocol” and
we can say that a protocol optimal against $A_0$ is one that, for each t, uses most
efficiently the probabilistic guess $P_{A_0}[(F_1, \ldots, F_t) = \cdot \mid T \ge t,\ \mathcal{S}_t = \sigma]$ in order to
choose a “most appropriate” set $S_{t+1}$ for time $t+1$. The next challenge towards the
construction of optimal protocols is to understand how, for every t, such a “most
appropriate” set $S_{t+1}$ is selected. For this we use the general equality

$$E_{\pi,A}[T] = \sum_{t \ge 1} P_{\pi,A}[T \ge t],$$

established in Lemma 8.1.2. A natural idea is to try the following greedy strategy.
Select a set $s_1$ maximizing the quantity $P_{A_0}[T \ge 1 \mid \mathcal{S}_1 = s_1]$. Then select a set s
maximizing the quantity $P_{A_0}[T \ge 2 \mid T \ge 1,\ \mathcal{S}_2 = (s_1, s)]$. Generally, assuming that
the schedule $\mathcal{S}_t = (s_1, \ldots, s_t)$ has already been chosen and assuming that the system
is still alive at time t, i.e., that $T \ge t$, we select a set s maximizing the probability

$$P_{A_0}[T \ge t+1 \mid T \ge t,\ \mathcal{S}_{t+1} = (s_1, \ldots, s_t, s)]$$

of being alive one more time. If the protocol $\pi_0$ defined by this procedure maximizes
$P_{\pi,A_0}[T \ge t]$ for every t, it also maximizes the sum $\sum_{t \ge 1} P_{\pi,A_0}[T \ge t]$ and hence is
a protocol optimal against $A_0$. As is discussed in Section 7.6.4, this is true if the
schedule $\sigma = (s_1, \ldots, s_t)$ greedily chosen as described above maximizes the quantity
$P_{A_0}[T \ge t \mid \mathcal{S}_t = \sigma]$.


We therefore compute this quantity $P_{A_0}[T \ge t \mid \mathcal{S}_t = \sigma]$ for every $\sigma$ and show that it
is equal to $\prod_{j=1}^{t-1} \frac{|c_j(t)|}{n}$, where the values $c_j(t)$, $1 \le j \le t$, are uniquely derived from $\sigma$.
This computation uses critically the relation $P_{A_0}[(F_1, \ldots, F_t) = \cdot \mid T \ge t,\ \mathcal{S}_t = \sigma] =
\bigotimes_{j=1}^{t} U_{c_j(t)}$ discussed above. We establish that the schedules maximizing this value
are the schedules for which all the associated terms $|c_j(t)|$, $1 \le j \le t-1$, differ by at
most one, i.e., the schedules in normal form. (See Definition 7.5.2.) By invariants
g and h of Lemma 7.4.1, all protocols in $Prot(Prog_0)$ produce such schedules and
hence are optimal against $A_0$. This is formally established in Section 7.6.5.
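The product formula for the survival probability can be checked on small schedules by enumerating the uniform choices of the random adversary. The sketch below is an illustration under our own assumptions: the function names are hypothetical, the schedule is fixed in advance (a deterministic protocol), and exact rational arithmetic is used so that the two values can be compared for equality.

```python
from fractions import Fraction
from math import prod
from typing import FrozenSet, List, Set


def survival_prob_formula(schedule: List[Set[int]], n: int) -> Fraction:
    """prod_{j=1}^{t-1} |c_j(t)| / n for the t-schedule `schedule`."""
    t = len(schedule)
    sizes = [len(schedule[j - 1] - set().union(*schedule[j:t]))
             for j in range(1, t)]
    return Fraction(prod(sizes), n ** (t - 1))


def survival_prob_enumerated(schedule: List[Set[int]]) -> Fraction:
    """P[T >= t] against the random adversary, by exhaustive enumeration."""
    t = len(schedule)

    def rec(u: int, faults: FrozenSet[int]) -> Fraction:
        # `faults` holds f_1, ..., f_{u-1}; the system was alive up to u-1.
        if schedule[u - 1] & faults:
            return Fraction(0)             # s_u reuses a faulty processor
        if u == t:
            return Fraction(1)             # alive at time t
        choices = schedule[u - 1] - faults  # equals s_u at this point
        total = sum(rec(u + 1, faults | {f}) for f in choices)
        return total / len(choices)

    return rec(1, frozenset())


sched = [{1, 2}, {2, 3}, {4, 5}]
by_formula = survival_prob_formula(sched, n=2)
by_enumeration = survival_prob_enumerated(sched)   # agrees with by_formula
```

On a surviving path the adversary always draws from the full set $s_u$, which is why each factor has denominator n.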


The results established in Section 7.6.4 also show that the schedules produced by
the greedy procedure previously described are in normal form, and hence that the
greedy procedure is optimal against $A_0$. Actually, the protocols produced by the
greedy procedure are exactly those in $Prot(Prog_0)$.


7.6.2 Analysis of the distribution of the random variables $F_j$

Let t and j be such that $0 \le t \le p-1$ and $1 \le j \le t$. Assume that the adversary
is the random adversary $A_0$ defined in Definition 7.4.1. The next lemma says that,
conditioned on the past schedule and on the fact that the system is still alive at
time t,

• each random variable $F_j$ has a uniform distribution on the set $c_j(t)$,

• the random variables $(F_j)_{1 \le j \le t}$ are independent.

This lemma will be a crucial tool for the estimation of the probability
$P_{A_0}[T \ge t \mid \mathcal{S}_t = \sigma]$ done in Lemma 7.6.10.


Lemma 7.6.1 Let $1 \le t \le p$ and $\sigma \in Feas_{A_0}$.¹⁰ Then, for every family $(\Phi_1, \ldots, \Phi_t)$
of subsets of [p],

$$P_{A_0}[(F_1, \ldots, F_t) \in (\Phi_1, \ldots, \Phi_t) \mid T \ge t,\ \mathcal{S}_t = \sigma] = \prod_{j=1}^{t} U_{c_j(t)}[\Phi_j],$$

where $c_j(t)$, $1 \le j \le t$, are the values of $C_j(t)$, $1 \le j \le t$, uniquely determined by the
condition $\mathcal{S}_t = \sigma$.

¹⁰ See page 149 for the definition of $Feas_{A_0}$.


Lemma 7.3.5, page 148, justifies the well-formedness of this statement.


PROOF. Define $(\Omega_1, \mathcal{G}_1, P_1)$ to be the probability space $(\Omega, \mathcal{G}, P_{A_0})$. Using the
notions recalled in Definition 8.1.2 we can reformulate the statement of Lemma 7.6.1
into the concise form: for all t, $0 \le t \le p-1$, and all t-schedules $\sigma$ in $Feas_{A_0}$,

$$\mathcal{L}\big((F_1, \ldots, F_t) \mid T \ge t,\ \mathcal{S}_t = \sigma\big) = \bigotimes_{j=1}^{t} \mathcal{L}\big(F_j \mid T \ge t,\ \mathcal{S}_t = \sigma\big) = \bigotimes_{j=1}^{t} U_{c_j(t)}. \qquad (7.1)$$

Equation 7.1 holds vacuously for $t = 0$: the family $(F_j)_{1 \le j \le t}$ is empty and there
is nothing to prove. We now work by induction on t, $0 \le t < p-1$: assume that
Equation (7.1) has been established for time t. To progress from t to $t+1$, we first
prove that, for every t-schedule $\sigma$ and every $s \in P_n(p)$ such that $(\sigma, s)$ is in
$Feas_{A_0}$, for every j, $1 \le j \le t+1$,

$$\mathcal{L}\big(F_j \mid T \ge t+1,\ \mathcal{S}_{t+1} = (\sigma, s)\big) = U_{c_j(t+1)}. \qquad (7.2)$$

• Consider the case $j = t+1$. This case corresponds to the processor $F_{t+1}$ selected
at time $t+1$ by the adversary. We want to prove that $\mathcal{L}\big(F_{t+1} \mid T \ge t+1,\ \mathcal{S}_{t+1} =
(\sigma, s)\big) = U_{c_{t+1}(t+1)}$. Conditioning on $T \ge t+1$ ensures that the adversary did not
select a processor of s at a time prior to $t+1$. Hence, at the end of time $t+1$,
there is exactly one faulty processor, $F_{t+1}$, amongst the set s. As $A_0$ is the random
adversary, $F_{t+1}$ is uniformly distributed in s, i.e., has a law equal to $U_s = U_{c_{t+1}(t+1)}$.

• Let j be any integer, $1 \le j \le t$. We let $F'_j$ denote the random variable $(F_j \mid T \ge
t,\ \mathcal{S}_t = \sigma)$. Then

$$\mathcal{L}\big(F_j \mid T \ge t+1,\ \mathcal{S}_{t+1} = (\sigma, s)\big)$$
$$= P_{A_0}\big[F_j \in \cdot \mid T \ge t,\ \mathcal{S}_t = \sigma,\ F_1 \notin s, \ldots, F_t \notin s\big] \qquad (7.3)$$
$$= P_{A_0}\big[F_j \in \cdot \mid T \ge t,\ \mathcal{S}_t = \sigma,\ F_j \notin s\big] \qquad (7.4)$$
$$= P_{A_0}\big[F'_j \in \cdot \mid F'_j \notin s\big] \qquad (7.5)$$
$$= U_{c_j(t)}\big[\,\cdot \mid F'_j \notin s\big] \qquad (7.6)$$
$$= U_{c_j(t) \cap s^c} \qquad (7.7)$$
$$= U_{c_j(t+1)}. \qquad (7.8)$$

Equality 7.3 comes from the fact that, conditioned on the schedule $\mathcal{S}_t = \sigma$ and the
fact that the system has survived up to time t, the system survives at time $t+1$
exactly when the set s selected at time $t+1$ contains no processor broken at a
previous time. Equality 7.4 comes from the fact that, by our induction hypothesis
at level t, $(F_j \mid T \ge t,\ \mathcal{S}_t = \sigma)$ is independent from $(F_i \mid T \ge t,\ \mathcal{S}_t = \sigma)$, $1 \le i \le t$,
$i \ne j$. Equality 7.5 comes from the fact that, conditioned on $\{T \ge t,\ \mathcal{S}_t = \sigma\}$, the
condition $F_j \notin s$ is equivalent to $F'_j \notin s$. Equality 7.6 comes from the fact that, by
our induction hypothesis at level t,

$$P_{A_0}[F'_j \in \cdot\,] = U_{c_j(t)}[\,\cdot\,].$$

Equality 7.7 is a simple application of Lemma 8.1.3: again, remember that $U_{c_j(t)}$ is
the law of $F'_j$. Equality 7.8 comes from the definition of $c_j(t+1)$. This finishes
establishing that Equation 7.2 holds for every j, $1 \le j \le t+1$.


A consequence of Equation 7.2 is that, for every j, $1 \le j \le t$, the support of
$(F_j \mid T \ge t,\ \mathcal{S}_t = \sigma)$ is equal to $c_j(t)$. By Lemma 7.5.1 the sets $c_j(t)$ are all disjoint.
This trivially implies the independence of the random variables and concludes the
proof by induction of Formula 7.1.


The proof of the next result can be omitted on first reading. It will be used in the
next section and in Lemma 7.6.11 where optimal schedules are characterized.


Corollary 7.6.2 Let $\sigma$ be a schedule and let $(l_0(t), \ldots, l_{n+1}(t))$ be the associated
sequence as described in Definition 7.5.1. Then $\sigma$ is in $Feas_{A_0}$ if and only if $l_0(t) = 0$.


PROOF.

• Assume that $\sigma \in Feas_{A_0}$. Then, by Equation 7.1,

$$\mathcal{L}\big((F_1, \ldots, F_t) \mid T \ge t,\ \mathcal{S}_t = \sigma\big) = \bigotimes_{j=1}^{t} U_{c_j(t)}.$$

The uniform distributions $U_{c_j(t)}$ are well defined probability distributions only if
$c_j(t) \ne \emptyset$ for j, $1 \le j \le t-1$, i.e., if $l_0(t) = 0$.


• We now prove the converse and assume that $(s_1, \ldots, s_t)$ is a schedule which is not
in $Feas_{A_0}$. We want to establish that $l_0(t) > 0$.


Note first that every set $s_1$ of size n is clearly in $Feas_{A_0}$. (Consider the protocol
$\pi$ such that $S_t = s_1$ for all t. Then $P_{\pi,A_0}[T \ge 1,\ S_1 = s_1] = P_{\pi,A_0}[T \ge 1] = 1$,
because $m = 1$.) This establishes that $t > 1$. Let v, $1 < v \le t$, be the smallest
u such that $(s_1, \ldots, s_u) \notin Feas_{A_0}$. Using the formulation given in Remark 1 after
Definition 7.3.3, page 149, and using Lemma 7.3.2 we obtain

$$P_{A_0}[T \ge v \mid \mathcal{S}_v = (s_1, \ldots, s_v)] = 0$$

whereas

$$P_{A_0}[T \ge v-1 \mid \mathcal{S}_{v-1} = (s_1, \ldots, s_{v-1})] > 0. \qquad (7.9)$$

Therefore:

$$0 = P_{A_0}[T \ge v \mid \mathcal{S}_v = (s_1, \ldots, s_v)]$$
$$= P_{A_0}[T \ge v,\ T \ge v-1 \mid \mathcal{S}_v = (s_1, \ldots, s_v)]$$
$$= P_{A_0}[T \ge v \mid T \ge v-1,\ \mathcal{S}_v = (s_1, \ldots, s_v)] \cdot P_{A_0}[T \ge v-1 \mid \mathcal{S}_v = (s_1, \ldots, s_v)]$$
$$= P_{A_0}[T \ge v \mid T \ge v-1,\ \mathcal{S}_v = (s_1, \ldots, s_v)] \cdot P_{A_0}[T \ge v-1 \mid \mathcal{S}_{v-1} = (s_1, \ldots, s_{v-1})], \qquad (7.10)$$

where the last equality is a consequence of Lemma 7.3.6.


Using Equations 7.9 and 7.10 allows us to derive the first of the following equalities.

$$0 = P_{A_0}[T \ge v \mid T \ge v-1,\ \mathcal{S}_v = (s_1, \ldots, s_v)]$$
$$= P_{A_0}\big[s_v \cap \{F_1, \ldots, F_{v-1}\} = \emptyset \mid T \ge v-1,\ \mathcal{S}_v = (s_1, \ldots, s_v)\big]$$
$$= P_{A_0}\big[s_v \cap \{F_1, \ldots, F_{v-1}\} = \emptyset \mid T \ge v-1,\ \mathcal{S}_{v-1} = (s_1, \ldots, s_{v-1})\big] \qquad (7.11)$$
$$= P_{A_0}\Big[\bigcap_{j=1}^{v-1} \{F_j \notin s_v\} \mid T \ge v-1,\ \mathcal{S}_{v-1} = (s_1, \ldots, s_{v-1})\Big]$$
$$= \prod_{j=1}^{v-1} U_{c_j(v-1)}[s_v^c], \qquad (7.12)$$

where Equation 7.11 comes from Lemma 7.3.6 and Equation 7.12 from Lemma 7.6.1.
The nullity of the product in 7.12 implies that there exists j, $1 \le j \le v-1$, such
that

$$U_{c_j(v-1)}[s_v^c] = 0,$$
$$\text{i.e.,} \quad s_v^c \cap c_j(v-1) = \emptyset,$$
$$\text{i.e.,} \quad c_j(v) = \emptyset,$$

and this implies that $l_0(v) > 0$. To finish, note that the sequence $(l_0(u))_u$ is a
non-decreasing sequence so that $l_0(v) > 0$ implies that $l_0(t) > 0$, which was to be proven.


7.6.3 A max-min equality for the maximal survival times 


Our final proofs will rely on the equality $E_{\pi,A}[T] = \sum_{t \ge 1} P_{\pi,A}[T \ge t]$ and will
entail an analysis of each of the non-zero terms of this summation. This section is
concerned with finding which of these terms are non-zero.


Following the definitions given in Lemma 8.2.4 we define $t_{max}(A) \stackrel{\text{def}}{=} \max\{t;\ \sup_\pi
P_{\pi,A}[T \ge t] > 0\}$ and $t_{min}(\pi) = \max\{t;\ \inf_A P_{\pi,A}[T \ge t] > 0\}$. (Note that, for every
A and every $\pi$, $P_{\pi,A}[T \ge t] = 0$ if $t > p$. This justifies that $\sup\{t;\ \sup_\pi P_{\pi,A}[T \ge
t] > 0\} = \max\{t;\ \sup_\pi P_{\pi,A}[T \ge t] > 0\}$. We similarly show that the supremum is
achieved in $t_{min}(\pi)$.)


We easily see that, for a given adversary A, we can limit the range of summation of
the series $\sum_{t \ge 1} P_{\pi,A}[T \ge t]$ to within the first $t_{max}(A)$ terms. For a given protocol $\pi$,
the interpretation of $t_{min}(\pi)$ is in general a bit more complicated: if $\inf_A P_{\pi,A}[T \ge
t_{min}(\pi) + 1] = \min_A P_{\pi,A}[T \ge t_{min}(\pi) + 1]\ (= 0)$, then there exists an adversary A for
which only $t_{min}(\pi)$ terms of the series $\sum_{t \ge 1} P_{\pi,A}[T \ge t]$ are non-zero.


Note that all the values $t_{max}(A)$ are a priori bounded by p. Hence Lemma 8.2.4
shows that the quantities $t_{max}(A)$ and $t_{min}(\pi)$ satisfy the max-min inequality

$$\max_\pi t_{min}(\pi) \le \min_A t_{max}(A).$$

In this section we show that we can strengthen this inequality into an equality in
the special case where $m = 1$. (See Corollary 7.6.9.) Lemma 7.6.8 will also be useful
in the sequel of this work.


Let t, $t \le p$, be a time, $\sigma$ be a t-schedule and $\phi$ be a t-sequence of faults adapted to
$\sigma$. We let $t_{\sigma,\phi}$ be the survival time of schedule $\sigma$ used in conjunction with $\phi$. By
definition, $t_{\sigma,\phi}$ is less than or equal to t.

Lemma 7.6.3 Let t, $t \le p$, be a time, $\sigma$ be a t-schedule and $\phi$ be a t-sequence of
faults adapted to $\sigma$ such that $t_{\sigma,\phi} = t$. Then $A_0 \in Agenerating_\sigma(\phi)$.

PROOF. This is a direct consequence of the definition of $A_0$. At each time, $A_0$ selects
with non-zero probability every element not already selected. Hence $A_0$ selects $\phi$
with non-zero probability.


Lemma 7.6.4 Let t, $t \le p$, be a time, $\sigma$ be a t-schedule and $\phi$ be a t-sequence of
faults adapted to $\sigma$. Then $P_{\pi,A}[T \ge t,\ \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi] > 0$ if and only if $t_{\sigma,\phi} \ge t$,
$\pi \in Pgenerating(\sigma)$ and $A \in Agenerating_\sigma(\phi)$.


PROOF. This is a direct consequence of the fact that, conditioned on $\mathcal{S}_t = \sigma$ and
$\mathcal{F}_t = \phi$, $T \ge t$ is equivalent to $t_{\sigma,\phi} \ge t$.


Lemma 7.6.5 For every $t > p-n+1$ there is no t-schedule $\sigma$ and no t-fault
sequence $\phi$ such that $t_{\sigma,\phi} \ge t$.


PROOF. Working by contradiction, consider $t > p-n+1$ and assume the existence
of such a schedule $\sigma$ and of such a fault sequence $\phi$. Consider the point just after the
selection of the t-th fault $f_t$ by the adversary. The hypothesis $t_{\sigma,\phi} \ge t$ implies that,
at this point, $s_t$ contains $n-1$ non-faulty elements. On the other hand the t elements
$f_1, \ldots, f_t$ are faulty. Hence we must have $p \ge t+n-1$, which is a contradiction.


Lemma 7.6.6 Let t be a time and $\pi$ a protocol. Then $P_{\pi,A}[T \ge t] > 0$ only if
$P_{\pi,A_0}[T \ge t] > 0$.


PROOF. The condition $P_{\pi,A}[T \ge t] > 0$ implies that there must exist a t-schedule
$\sigma$ and a t-sequence of faults $\phi$, adapted to $\sigma$, such that $P_{\pi,A}[T \ge t,\ \mathcal{S}_t = \sigma,\ \mathcal{F}_t =
\phi] > 0$. By the only if part of Lemma 7.6.4, it must then be the case that $t_{\sigma,\phi} \ge t$,
that $\pi \in Pgenerating(\sigma)$ and that $A \in Agenerating_\sigma(\phi)$. By Lemma 7.6.3, $A_0 \in
Agenerating_\sigma(\phi)$. By the if part of Lemma 7.6.4, $P_{\pi,A_0}[T \ge t,\ \mathcal{S}_t = \sigma,\ \mathcal{F}_t = \phi] > 0$
and hence $P_{\pi,A_0}[T \ge t] > 0$.


In the following lemma, $l_0(t)$ is the value related to $\sigma$ as defined in Definition 7.5.1.


Lemma 7.6.7 Let t, $t \le p$, be a time and $\pi$ a protocol. Then $t \le t_{min}(\pi)$ if and only
if there exists a t-schedule $\sigma$ such that $l_0(t) = 0$ and such that $P_\pi[\mathcal{S}_t = \sigma] > 0$.


PROOF. If part.

$$P_{\pi,A_0}[T \ge t] \ \ge\ P_{\pi,A_0}[T \ge t,\ \mathcal{S}_t = \sigma]$$
$$= P_{A_0}[T \ge t \mid \mathcal{S}_t = \sigma]\, P_\pi[\mathcal{S}_t = \sigma].$$

By assumption, the term $P_\pi[\mathcal{S}_t = \sigma]$ is positive. On the other hand, by
Corollary 7.6.2, the assumption $l_0(t) = 0$ implies that $\sigma \in Feas_{A_0}$, i.e., that $P_{A_0}[T \ge t \mid
\mathcal{S}_t = \sigma] > 0$. Hence


inf PralT >t] > Pra T >t] >. 
This inequality, along with the definition of $t_{min}(\pi)$, shows that $t \le t_{min}(\pi)$.


Only if part. By definition of $t_{min}(\pi)$, there exists an adversary A such that
$P_{\pi,A}[T \ge t] > 0$. By Lemma 7.6.6, this implies that $P_{\pi,A_0}[T \ge t] > 0$. Hence
there must exist a t-schedule $\sigma$ such that $P_{\pi,A_0}[T \ge t,\ \mathcal{S}_t = \sigma] > 0$. This implies
that $P_{\pi,A_0}[\mathcal{S}_t = \sigma] > 0$ and that $\sigma \in Feas_{A_0}$, i.e., by Corollary 7.6.2, that $l_0(t) = 0$.


Lemma 7.6.8 For all A and for all $\pi_0$ in $Prot(Prog_0)$, $t_{min}(\pi_0) = t_{max}(A) = p-n+1$.


PROOF. For every adversary A and every protocol $\pi$, the inequality $t_{min}(\pi) \le
t_{max}(A)$ is true for any $m \ge 1$ and stems from the general result of Lemma 8.2.4. We
prove that the converse inequality holds in the special case where $m = 1$ and where
the protocol is in $Prot(Prog_0)$.


Let $t \le p-n+1$ and let $\pi_0 \in Prot(Prog_0)$. Every t-schedule $\sigma$ selected by $\pi_0$ is such
that $l_0(t) = 0$: by invariants g and h of Lemma 7.4.1, for every j, $1 \le j \le t-1$, the
set $c_j(t)$ has size at least $\min(n, \lfloor\frac{p-n}{t-1}\rfloor) \ge 1$. This fact together with Lemma 7.6.7
implies that $t_{min}(\pi_0) \ge p-n+1$. On the other hand, Lemmas 7.6.5 and 7.6.4
imply that $P_{\pi,A}[T > p-n+1] = 0$ for every protocol $\pi$ and every adversary
A. This fact implies that $t_{min}(\pi) \le p-n+1$ and that $t_{max}(A) \le p-n+1$ for
every $\pi$ and A. Hence, for every A, $t_{max}(A) \le p-n+1 = t_{min}(\pi_0)$. This, along
with the inequality $t_{min}(\pi_0) \le t_{max}(A)$ previously established, implies the equality
$t_{min}(\pi_0) = t_{max}(A) = p-n+1$, valid for every A.


Corollary 7.6.9 $\max_\pi t_{\min}(\pi) = \min_A t_{\max}(A)$.

PROOF. By Lemma 8.2.4, $\max_\pi t_{\min}(\pi) \le \min_A t_{\max}(A)$. Conversely, let $\pi_0$ be any protocol in $Prot(Prog_0)$. We have:

$$\min_A t_{\max}(A) \;=\; t_{\min}(\pi_0) \quad \text{(by Lemma 7.6.8)} \;\le\; \max_\pi t_{\min}(\pi).$$

It follows that $\max_\pi t_{\min}(\pi) = \min_A t_{\max}(A)$.


7.6.4 Schedules in normal form are most efficient against $A_0$


The next lemma is crucial for evaluating the performance of a schedule. In essence, for a given schedule $\sigma$, it allows us to compute for each time $t$ the probability that the system survives one more round when following the schedule $\sigma$, provided that the system has survived thus far and that the adversary is $A_0$. It will allow us in Lemma 7.6.11 to characterize the schedules that perform best against $A_0$.


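Lemma 7.6.10 For every $t \ge 1$ and every $t$-schedule $\sigma$ the two following equalities hold:

$$P_{A_0}[T \ge t \mid S_t = \sigma,\ T \ge t-1] = \prod_{j=1}^{t-1} \frac{|c_j(t)|}{|c_j(t-1)|} \tag{7.13}$$

$$P_{A_0}[T \ge t \mid S_t = \sigma] = \prod_{j=1}^{t-1} \frac{|c_j(t)|}{n} = \prod_{j=0}^{n} \left(\frac{j}{n}\right)^{l_j(t)} \tag{7.14}$$

where the sets $c_j(u)$, $1 \le j \le u \le t$, are uniquely derived from $\sigma$ and where, by convention, we set $0^0$ to be equal to 1 and $0/0$ to be equal to 0.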


Corollary 7.3.5, page 148, justifies the well-foundedness of the probabilities com- 
puted in Lemma 7.6.10. 


PROOF. Consider first the case where $\sigma \notin Feas_{A_0}$. By Remark 1 on page 149, the left-hand side $P_{A_0}[T \ge t \mid S_t = \sigma]$ of Equation 7.14 is equal to 0. As

$$P_{A_0}[T \ge t \mid T \ge t-1,\ S_t = \sigma] = \frac{P_{A_0}[T \ge t,\ S_t = \sigma]}{P_{A_0}[T \ge t-1,\ S_t = \sigma]},$$

the left-hand side of Equation 7.13 is also 0. (Note that, by Convention 8.1.1, the left-hand side $P_{A_0}[T \ge t \mid S_t = \sigma,\ T \ge t-1]$ is automatically set to 0 in the special case where $P_{A_0}[T \ge t-1,\ S_t = \sigma] = 0$. In this case, the convention $0/0 = 0$ also gives 0 to the right-hand side.) On the other hand, by Corollary 7.6.2, the condition $\sigma \notin Feas_{A_0}$ implies that $l_0(t) > 0$, i.e., that there exists $j$, $1 \le j \le t-1$, such that $|c_j(t)| = 0$ and hence that $\prod_{j=1}^{t-1} |c_j(t)| = 0$. This implies that the right-hand side of both Equations 7.13 and 7.14 is equal to zero.


We can therefore restrict ourselves in the sequel to the case where $\sigma = (s_1, \ldots, s_t) \in Feas_{A_0}$.

a. We begin with the proof of Equation 7.13.

$$\begin{aligned}
P_{A_0}[T \ge t \mid S_t = \sigma,\ T \ge t-1]
&= P_{A_0}\Big[\bigcap_{j=1}^{t-1}\{F_j \notin s_t\} \;\Big|\; S_{t-1} = (s_1,\ldots,s_{t-1}),\ T \ge t-1\Big] && (7.15)\\
&= \prod_{j=1}^{t-1} U_{c_j(t-1)}\big[c_j(t-1) - s_t\big] && (7.16)\\
&= \prod_{j=1}^{t-1} \frac{|c_j(t-1) - s_t|}{|c_j(t-1)|} && (7.17)\\
&= \prod_{j=1}^{t-1} \frac{|c_j(t)|}{|c_j(t-1)|}. && (7.18)
\end{aligned}$$

Equation 7.15 is a consequence of Lemma 7.3.6. Equation 7.16 is a simple application of Lemma 7.6.1. Equation 7.17 is a simple application of the fact that $U_{c_j(t-1)}$ is the uniform distribution on $c_j(t-1)$. Equation 7.18 comes from the definition of $c_j(t)$. This establishes Equation 7.13.


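b. We now turn to the proof of Equation 7.14.

$$\begin{aligned}
P_{A_0}[T \ge t \mid S_t = (s_1,\ldots,s_t)]
&= P_{A_0}[T \ge t,\ T \ge t-1,\ \ldots,\ T \ge 1 \mid S_t = (s_1,\ldots,s_t)] && (7.19)\\
&= P_{A_0}[T \ge t \mid T \ge t-1,\ S_t = (s_1,\ldots,s_t)] \cdot P_{A_0}[T \ge t-1 \mid T \ge t-2,\ S_t = (s_1,\ldots,s_t)] \\
&\qquad \cdots\ P_{A_0}[T \ge 2 \mid T \ge 1,\ S_t = (s_1,\ldots,s_t)] \cdot P_{A_0}[T \ge 1 \mid S_t = (s_1,\ldots,s_t)] && (7.20)\\
&= P_{A_0}[T \ge t \mid T \ge t-1,\ S_t = (s_1,\ldots,s_t)] \cdot P_{A_0}[T \ge t-1 \mid T \ge t-2,\ S_{t-1} = (s_1,\ldots,s_{t-1})] \\
&\qquad \cdots\ P_{A_0}[T \ge 2 \mid T \ge 1,\ S_2 = (s_1, s_2)] \cdot P_{A_0}[T \ge 1 \mid S_1 = s_1] && (7.21)\\
&= \prod_{u=2}^{t} \prod_{j=1}^{u-1} \frac{|c_j(u)|}{|c_j(u-1)|} && (7.22)\\
&= \prod_{j=1}^{t-1} \prod_{u=j+1}^{t} \frac{|c_j(u)|}{|c_j(u-1)|}.
\end{aligned}$$

Equation 7.19 is justified by the equality $\{T \ge t\} = \{T \ge t,\ T \ge t-1,\ \ldots,\ T \ge 1\}$. Equation 7.20 is obtained by successive conditioning. Equation 7.21 is a consequence of Lemma 7.3.6. Equation 7.22 is a consequence of Equation 7.13 and of the simple property $P_{A_0}[T \ge 1 \mid S_1 = s_1] = 1$. In the last expression the inner product telescopes to $|c_j(t)| / |c_j(j)|$, and $|c_j(j)| = |s_j| = n$, which gives Equation 7.14.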
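As a sanity check on Equation 7.14, the following sketch compares the product formula with a brute-force enumeration of fault sequences on a small example. It rests on assumptions that are not spelled out in this section: $c_j(t)$ is taken to be the set of elements of $s_j$ not reused in $s_{j+1}, \ldots, s_t$, the random adversary $A_0$ is modelled as drawing each fault $F_j$ uniformly and independently from $s_j$, and the system is taken to survive up to time $t$ exactly when no set $s_u$ contains an earlier fault. The function names and the example schedule are hypothetical.

```python
from fractions import Fraction
from itertools import product


def c_sets(schedule):
    """Return [c_1(t), ..., c_{t-1}(t)] for a t-schedule given as a list of sets.

    Assumed definition: c_j(t) holds the elements of s_j that are not reused
    in any later set s_{j+1}, ..., s_t.
    """
    t = len(schedule)
    result = []
    for j in range(t - 1):
        reused_later = set().union(*schedule[j + 1:])
        result.append(set(schedule[j]) - reused_later)
    return result


def survival_by_formula(schedule, n):
    """Right-hand side of Equation 7.14: the product of |c_j(t)| / n."""
    prob = Fraction(1)
    for c in c_sets(schedule):
        prob *= Fraction(len(c), n)
    return prob


def survival_by_enumeration(schedule):
    """Enumerate every fault sequence (F_j uniform on s_j, independent draws)."""
    t = len(schedule)
    outcomes = list(product(*[sorted(s) for s in schedule[:-1]]))
    alive = sum(
        1
        for faults in outcomes
        # Survival: no round u uses a processor that failed in a round j < u.
        if all(faults[j] not in schedule[u] for u in range(1, t) for j in range(u))
    )
    return Fraction(alive, len(outcomes))


# Hypothetical example with p = 5 processors and n = 2 processors per round.
example = [{1, 2}, {3, 4}, {1, 5}]
print(survival_by_formula(example, 2), survival_by_enumeration(example))
```

In this example processor 1 is reused in round 3, so $c_1(3)$ has a single element and both computations agree on the same value.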


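The following lemma establishes that, for every time $t$, $1 \le t \le t_{\max}(A_0)$, the set of $t$-schedules $\sigma$ maximizing the probability $P_{A_0}[T \ge t \mid S_t = \sigma]$ is equal to the set $\mathcal{N}_t$ of $t$-schedules in normal form.

Lemma 7.6.11 For any $t$, $1 \le t \le t_{\max}(A_0)$, the value $P_{A_0}[T \ge t \mid S_t = \sigma]$ is the same for all $t$-schedules $\sigma$ in $\mathcal{N}_t$. We let $P_{A_0}[T \ge t \mid S_t \in \mathcal{N}_t]$ denote this common value. Furthermore, let $\sigma'$ be a $t$-schedule. We have:

1. If $\sigma' \in \mathcal{N}_t$, then
$$P_{A_0}[T \ge t \mid S_t = \sigma'] = \max_\sigma P_{A_0}[T \ge t \mid S_t = \sigma]. \tag{7.23}$$

2. Conversely, if $\sigma' \notin \mathcal{N}_t$, then
$$P_{A_0}[T \ge t \mid S_t = \sigma'] < \max_\sigma P_{A_0}[T \ge t \mid S_t = \sigma]. \tag{7.24}$$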


PROOF. By Lemma 7.6.10, for every $t$-schedule $\sigma$,

$$P_{A_0}[T \ge t \mid S_t = \sigma] = \prod_{j=1}^{t-1} \frac{|c_j(t)|}{n} = \prod_{j=0}^{n} \left(\frac{j}{n}\right)^{l_j(t)}.$$

A first consequence of this fact is that the value $P_{A_0}[T \ge t \mid S_t = \sigma]$ depends on $\sigma$ only through the values $(l_0(t), \ldots, l_n(t))$. By Lemma 7.5.4, there is a unique and well defined sequence $(l_0(t), \ldots, l_n(t))$ to which all schedules in $\mathcal{N}_t$ are associated. This implies that the probability $P_{A_0}[T \ge t \mid S_t = \sigma]$ takes only one value when $\sigma$ varies in $\mathcal{N}_t$. This fact justifies the notation

$$P_{A_0}[T \ge t \mid S_t \in \mathcal{N}_t] = P_{A_0}[T \ge t \mid S_t = \sigma'],$$

for any arbitrary schedule $\sigma'$ in $\mathcal{N}_t$.


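Taking into account that all schedules $\sigma' \in \mathcal{N}_t$ give rise to the same value $P_{A_0}[T \ge t \mid S_t = \sigma']$, we see that Equation 7.23 is a consequence of Equation 7.24, which we now set out to prove. By definition of $t_{\max}(A_0)$, the condition $t \le t_{\max}(A_0)$ implies that $0 < \sup_\pi P_{\pi,A_0}[T \ge t]$, so that, by Lemmas 8.1.4 and 7.3.2,

$$\begin{aligned}
0 &< \sup_\pi P_{\pi,A_0}[T \ge t] \\
&\le \sup_\pi \max_\sigma P_{\pi,A_0}[T \ge t \mid S_t = \sigma] \\
&= \max_\sigma \sup_\pi P_{\pi,A_0}[T \ge t \mid S_t = \sigma] \\
&= \max_\sigma P_{A_0}[T \ge t \mid S_t = \sigma]. && (7.25)
\end{aligned}$$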


Working by contradiction, assume that there exists a $t$-schedule $\sigma$ not in normal form but which maximizes the probability $P_{A_0}[T \ge t \mid S_t = \sigma]$. The non-normality of $\sigma$ implies that

a. either there exist $j_1$ and $j_2$ in $\{1, \ldots, t-1\}$ such that $|c_{j_1}(t)| \le |c_{j_2}(t)| - 2$,

b. or $c_0(t) \ne \emptyset$ and there exists $j$ in $\{1, \ldots, t-1\}$ such that $|c_j(t)| \le n - 1$.


We first consider case a. In this case:

$$P_{A_0}[T \ge t \mid S_t = \sigma] = \prod_{j=1}^{t-1} \frac{|c_j(t)|}{n} = \Big(\prod_{j \in \{1,\ldots,t-1\},\ j \ne j_1, j_2} \frac{|c_j(t)|}{n}\Big) \cdot \frac{|c_{j_1}(t)|}{n} \cdot \frac{|c_{j_2}(t)|}{n}.$$

Define the sequence $(\gamma_1, \ldots, \gamma_{t-1})$ derived from the sequence $(|c_1(t)|, \ldots, |c_{t-1}(t)|)$ by replacing $|c_{j_1}(t)|$ by $|c_{j_1}(t)| + 1$ and $|c_{j_2}(t)|$ by $|c_{j_2}(t)| - 1$. Note that

$$\sum_{j=1}^{t-1} \gamma_j = \sum_{j=1}^{t-1} |c_j(t)|. \tag{7.26}$$

By the only-if direction of Lemma 7.5.2 we have that $n \le p - (|c_1(t)| + \ldots + |c_{t-1}(t)|)$. By Equation 7.26 we therefore also have $n \le p - (\gamma_1 + \ldots + \gamma_{t-1})$. By the if direction of Lemma 7.5.2 there exists a schedule $\sigma' = (s'_1, \ldots, s'_t)$ whose associated c-sequence $(c'_1(t), \ldots, c'_{t-1}(t))$ satisfies $|c'_j(t)| = \gamma_j$, $1 \le j \le t-1$. We compute:

$$\begin{aligned}
(|c_{j_1}(t)| + 1)\,(|c_{j_2}(t)| - 1) &= |c_{j_1}(t)|\,|c_{j_2}(t)| + |c_{j_2}(t)| - |c_{j_1}(t)| - 1 \\
&\ge |c_{j_1}(t)|\,|c_{j_2}(t)| + 1,
\end{aligned}$$

because, by assumption, $|c_{j_1}(t)| \le |c_{j_2}(t)| - 2$.
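Then

$$\begin{aligned}
P_{A_0}[T \ge t \mid S_t = \sigma']
&= \prod_{j=1}^{t-1} \frac{\gamma_j}{n} \\
&= \Big(\prod_{j \in \{1,\ldots,t-1\},\ j \ne j_1, j_2} \frac{|c_j(t)|}{n}\Big) \cdot \frac{|c_{j_1}(t)| + 1}{n} \cdot \frac{|c_{j_2}(t)| - 1}{n} \\
&> \Big(\prod_{j \in \{1,\ldots,t-1\},\ j \ne j_1, j_2} \frac{|c_j(t)|}{n}\Big) \cdot \frac{|c_{j_1}(t)|}{n} \cdot \frac{|c_{j_2}(t)|}{n} && (7.27)\\
&= P_{A_0}[T \ge t \mid S_t = \sigma].
\end{aligned}$$

In Equation 7.27, the inequality is strict because, by Equation 7.25 and the fact that

$$P_{A_0}[T \ge t \mid S_t = \sigma] = \prod_{j=1}^{t-1} \frac{|c_j(t)|}{n}$$

(see Equation 7.14), all the terms $|c_j(t)|$, $1 \le j \le t-1$, must be non-zero.

But this contradicts the fact that, by assumption, the schedule $\sigma$ maximizes the probability $P_{A_0}[T \ge t \mid S_t = \sigma]$. This concludes the analysis of case a.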
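The rebalancing step of case a can be illustrated numerically. The sketch below moves one unit from a larger size to a size smaller by at least 2 and checks that the product of the sizes strictly increases, which is the inequality used to obtain Equation 7.27. The helper names are hypothetical and the sizes stand for the values $|c_j(t)|$; the existence of a schedule realizing the new sizes (Lemma 7.5.2) is not modelled here.

```python
def move_one_unit(sizes, j1, j2):
    """Return a copy of sizes with sizes[j1] increased by 1 and sizes[j2] decreased by 1."""
    out = list(sizes)
    out[j1] += 1
    out[j2] -= 1
    return out


def product_of(sizes):
    """Product of all entries (proportional to the survival probability of Equation 7.14)."""
    result = 1
    for s in sizes:
        result *= s
    return result


def gain_is_at_least_one(a, b):
    """Check (a + 1)(b - 1) >= ab + 1, which holds whenever a <= b - 2."""
    return (a + 1) * (b - 1) >= a * b + 1


# Hypothetical sizes: position 0 is smaller than position 1 by 2.
before = [1, 3, 2]
after = move_one_unit(before, 0, 1)
print(before, product_of(before), after, product_of(after))
```

The gain comes from the identity $(a+1)(b-1) - ab = b - a - 1$, which is at least 1 exactly when $a \le b - 2$.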


Case b is similar but easier. We could proceed as in a and use Lemma 7.5.2 to prove non-constructively the existence of an improving schedule. Just for the sake of enjoyment, we adopt another proof technique and provide an explicit modification of the schedule $\sigma$.

By assumption $c_0(t) \ne \emptyset$ and there exists $j_0$ in $\{1, \ldots, t-1\}$ such that $|c_{j_0}(t)| \le n - 1$. Note that the sequence $(c_{j_0}(u))_{u \ge j_0}$ is a non-increasing sequence of sets such that $|c_{j_0}(j_0)| = |s(j_0)| = n$. Hence there must exist a time $j_1$, $j_0 < j_1 \le t$, such that $|c_{j_0}(j_1)| \le |c_{j_0}(j_1 - 1)| - 1$. Consider the smallest such $j_1$. By the definition of the set $c_{j_0}(j_1)$ (see Definition 7.5.1), this implies the existence of an element $x \in s(j_0) \cap s(j_1)$. Let $y$ be any element in $c_0(t)$. Define

$$s'(j_1) = (s(j_1) - \{x\}) \cup \{y\}$$

and $s'(j) = s(j)$ for all $j \ne j_1$. Let $c'_1(t), \ldots, c'_{t-1}(t)$ be its associated c-sequence. We easily check that $|c'_{j_0}(t)| = |c_{j_0}(t)| + 1$ and that $c'_j(t) = c_j(t)$ for all $j$, $1 \le j < t$, $j \ne j_0$. Hence

$$P_{A_0}[T \ge t \mid S_t = \sigma'] = \prod_{j=1}^{t-1} \frac{|c'_j(t)|}{n} > \prod_{j=1}^{t-1} \frac{|c_j(t)|}{n} = P_{A_0}[T \ge t \mid S_t = \sigma].$$

As in case a, this contradicts the assumption that the schedule $\sigma$ maximizes the probability $P_{A_0}[T \ge t \mid S_t = \sigma]$. This concludes the analysis of case b.


7.6.5 Protocols in $Prot(Prog_0)$ are optimal against $A_0$


Recall that the notation $P_{A_0}[T \ge t \mid S_t \in \mathcal{N}_t]$ was introduced and defined in Lemma 7.6.11.


Lemma 7.6.12 For every $t$, $\sup_\pi P_{\pi,A_0}[T \ge t] = P_{A_0}[T \ge t \mid S_t \in \mathcal{N}_t]$. Furthermore, for $t \le t_{\max}(A_0)$, a protocol $\pi$ maximizes the probability $P_{\pi,A_0}[T \ge t]$ if and only if $P_\pi[S_t \in \mathcal{N}_t] = 1$.


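PROOF.
• Let $\sigma'$ be some schedule in $\mathcal{N}_t$. Let $\pi'$ be a protocol such that $P_{\pi'}[S_t = \sigma'] = 1$. Then

$$\begin{aligned}
\sup_\pi P_{\pi,A_0}[T \ge t] &\ge P_{\pi',A_0}[T \ge t] \\
&= P_{\pi',A_0}[T \ge t \mid S_t = \sigma'] \\
&= P_{A_0}[T \ge t \mid S_t = \sigma'] \\
&= P_{A_0}[T \ge t \mid S_t \in \mathcal{N}_t],
\end{aligned}$$

where the last equality is a consequence of Lemma 7.6.11.

• Conversely, let $\pi$ be any protocol. The beginning of the following argument repeats the proof of Lemma 8.1.4.

$$\begin{aligned}
P_{\pi,A_0}[T \ge t] &= \sum_\sigma P_{\pi,A_0}[T \ge t \mid S_t = \sigma] \cdot P_{\pi,A_0}[S_t = \sigma] \\
&= \sum_\sigma P_{A_0}[T \ge t \mid S_t = \sigma] \cdot P_\pi[S_t = \sigma] \\
&\le \max_\sigma P_{A_0}[T \ge t \mid S_t = \sigma] \cdot \sum_\sigma P_\pi[S_t = \sigma] && (7.28)\\
&= \max_\sigma P_{A_0}[T \ge t \mid S_t = \sigma] \\
&= P_{A_0}[T \ge t \mid S_t \in \mathcal{N}_t],
\end{aligned}$$

where here also the last equality is a consequence of Lemma 7.6.11. Hence

$$\sup_\pi P_{\pi,A_0}[T \ge t] \le P_{A_0}[T \ge t \mid S_t \in \mathcal{N}_t],$$

which completes the proof of the first part of the lemma. Furthermore, note that the inequality 7.28 is an equality exactly when

$$P_\pi\big[S_t \in \{\sigma \,;\, \sigma \text{ maximizes } P_{A_0}[T \ge t \mid S_t = \sigma]\}\big] = 1.$$

By Lemma 7.6.11, for $t \le t_{\max}(A_0)$, this happens exactly when $P_\pi[S_t \in \mathcal{N}_t] = 1$.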


Lemma 7.6.13 For every $\pi_0 \in Prot(Prog_0)$, for every $t$, $t \le t_{\max}(A_0)$, $P_{\pi_0}[S_t \in \mathcal{N}_t] = 1$.

PROOF. Let $\pi_0 \in Prot(Prog_0)$. By Lemma 7.6.8, $t_{\max}(A_0) = t_{\min}(\pi_0) = p - n + 1$. Invariants g and h of Lemma 7.4.1 show that for every $t$, $t \le p - n + 1$, $\pi_0$ only selects $t$-schedules in normal form.


The next result expresses that, for all $t$, the protocols in $Prot(Prog_0)$ maximize the probability of survival up to time $t$ when the adversary is the random adversary $A_0$.

Corollary 7.6.14 For every $\pi_0 \in Prot(Prog_0)$, for every $t$, $\max_\pi P_{\pi,A_0}[T \ge t] = P_{\pi_0,A_0}[T \ge t]$.

PROOF. The equality is trivially true if $t > t_{\max}(A_0)$: in this case both terms of the equation are equal to 0. On the other hand, if $t \le t_{\max}(A_0)$, the equality is a direct consequence of Lemmas 7.6.12 and 7.6.13.

We can finally provide the proof of Lemma 7.4.7, stating that $\pi_0$ is optimal against adversary $A_0$.


Corollary 7.6.15 For every $\pi_0 \in Prot(Prog_0)$, $\max_\pi E_{\pi,A_0}[T] = E_{\pi_0,A_0}[T]$.

PROOF. Let $\pi_0$ be an arbitrary protocol in $Prot(Prog_0)$. By Lemma 8.1.2, for every protocol $\pi$, $E_{\pi,A_0}[T] = \sum_{t \ge 1} P_{\pi,A_0}[T \ge t]$. Hence,

$$\begin{aligned}
\sup_\pi E_{\pi,A_0}[T] &= \sup_\pi \sum_{t \ge 1} P_{\pi,A_0}[T \ge t] \\
&\le \sum_{t \ge 1} \sup_\pi P_{\pi,A_0}[T \ge t] \\
&= \sum_{t \ge 1} P_{\pi_0,A_0}[T \ge t] && (7.29)\\
&= E_{\pi_0,A_0}[T].
\end{aligned}$$

Equality 7.29 is a consequence of Corollary 7.6.14.


7.7 $A_0$ is optimal against $\pi_0$


This section is devoted to the proof of Lemma 7.4.8. It uses in detail the code describing $Prog_0$ given on page 151, to which the reader is referred.

Notations: We use the convention established on page 156 and for each program variable $X$ let $X(t)$ denote the value held by $X$ at the end of round $t$. Hence $J(t)$ is the value held by the program variable $J$ at the end of round $t$. Recall also that $S_t(t) = S_t(t+1) = \ldots = S_t(p - n + 1)$ so that we abuse language and let $S_t$ also denote the value $S_t(t)$ held by the program variable $S_t$ at the end of round $t$.

For every $t$, $1 \le t \le p - n + 1$, and every $k$, $1 \le k \le t$, we let $\Delta_k(t+1)$ denote the value $|C_k(t)| - |C_k(t+1)|$. As usual, we use lower-case to denote realizations of a random variable and write $\delta_k(t+1)$ for a realization of $\Delta_k(t+1)$.

By invariants g, h and i given in Lemma 7.4.1, for every program prog in $Prog_0$, the value $\delta_k(t+1)$ of $\Delta_k(t+1)$ is uniquely determined for given values of $J(t)$ and $J(t+1)$, and hence for given values of $S_t$ and $J(t+1)$.

Remember also that $U_{P_{\delta_k(t)}(c_k(t))}$ denotes the uniform distribution on the set $P_{\delta_k(t)}(c_k(t))$ of subsets of $c_k(t)$ which are of size $\delta_k(t)$.


The next two lemmas, 7.7.1 and 7.7.2, characterize, at each time $t$, the probability distribution induced on $P_n(p)$ by the choice $S_{t+1}$ made for round $t+1$ by a program in $Prog_0$. This distribution depends on the $t$-schedule $\sigma$ selected by the program in the first $t$ rounds and on the value allocated to $J(t+1)$.

As discussed in Section 7.4.4, $J(t+1)$ is an internal variable of the program and is not measurable in the probability space $(\Omega, \mathcal{G})$ presented in Section 7.2. (In this space, only events describable in terms of the schedule $S_1, S_2, \ldots$ and in terms of the fault sequence $F_1, F_2, \ldots$ are measurable.) To be able to measure probabilistically the values allocated to this variable we therefore need to use the probability space $(\Omega', \mathcal{G}', P_{prog})$ defined in Section 7.4.4. In a figurative sense, programs in $Prog_0$ appear as black boxes when analyzed in the probability space $(\Omega, \mathcal{G})$: this probability space only allows us to measure the output $(S_1, S_2, \ldots)$ of the programs. In contrast, the probability space $(\Omega', \mathcal{G}')$ allows us to look inside each program prog in $Prot(Prog_0)$ and to analyze the internal actions it undertakes.


Lemma 7.7.1 Let $t$, $\lfloor p/n \rfloor \le t \le p - n + 1$, be arbitrary, let $prog \in Prog_0$, let $\sigma$ be a $t$-schedule and $\mathcal{J}$, $\mathcal{J} \subseteq [t]$, such that $P_{prog}[S_t = \sigma,\ J(t+1) = \mathcal{J}] > 0$. Then the random variables $S_{t+1} \cap c_k(t)$, $1 \le k \le t$, are independent and such that

$$P_{prog}\big[S_{t+1} \cap c_k(t) = \cdot \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] = U_{P_{\delta_k(t+1)}(c_k(t))}[\,\cdot\,],$$

where $\delta_k(t+1)$ and $c_k(t)$ are the values of $\Delta_k(t+1)$ and $C_k(t)$ uniquely determined by the conditions $S_t = \sigma$ and $J(t+1) = \mathcal{J}$.


PROOF. Consider round $t+1$, and let $\sigma$ denote the $t$-schedule $S_t$ selected up to this point.

Conditioned on $S_t = \sigma$, for every $k$, $0 \le k \le t$, the program variable $C_k$ is determined and equal to $c_k(t)$, by definition of $c_k(t)$. Hence, in the randomized invocations S := uniform(a; C_k) or S := uniform(a + 1; C_k) of lines 16, 22 and 35 of the code of $Prog_0$, the variable $C_k$ is equal to $c_k(t)$. Note also that no further change is brought in round $t$ to a variable $C_k$ once one of the lines 18, 24 or 37 is executed. Hence the value allocated to S in these invocations is equal to $C_k(t+1)$. These two facts imply that the randomized invocation of line 16 can be rewritten $C_k(t+1)$ := uniform(a; $c_k(t)$), and that the randomized invocations of lines 22 and 35 can be rewritten $C_k(t+1)$ := uniform(a + 1; $c_k(t)$).

By invariant g of Lemma 7.4.1, the sets $c_k(t)$, $0 \le k \le t$, form a partition of $[p]$ and hence are disjoint. Hence the random draws associated to the various randomized invocations $C_k(t+1)$ := uniform(a; $c_k(t)$) and $C_k(t+1)$ := uniform(a + 1; $c_k(t)$) are independent and uniformly distributed. Therefore the random variables $c_k(t) - C_k(t+1)$, $0 \le k \le t$, are also independent of each other.


The value $\mathcal{J}$ of $J(t+1)$ determines precisely which are the values of $k$, $1 \le k \le t$, for which $|C_k(t+1)| = a + 1$ and which are the values of $k$, $1 \le k \le t$, for which $|C_k(t+1)| = a$. After further conditioning on $J(t+1) = \mathcal{J}$, each random variable $c_k(t) - C_k(t+1)$ is drawn uniformly from the set $P_{\delta_k(t+1)}(c_k(t))$ of subsets of $c_k(t)$ which are of size $\delta_k(t+1)$. (Recall that the value $\delta_k(t+1)$ of $\Delta_k(t+1)$ is uniquely determined for the values $\sigma$ of $S_t$ and $\mathcal{J}$ of $J(t+1)$.) Hence, conditioned on $\{S_t = \sigma,\ J(t+1) = \mathcal{J}\}$, the family $(c_0(t) - C_0(t+1), \ldots, c_t(t) - C_t(t+1))$ is distributed according to the product distribution

$$\bigotimes_{k=0}^{t} U_{P_{\delta_k(t+1)}(c_k(t))}.$$

On the one hand, by invariant f of Lemma 7.4.1, the random set $S_{t+1}$ is equal to the disjoint union $\bigcup_{k=0}^{t} (c_k(t) \cap S_{t+1})$. On the other hand, by construction (see the code on page 151), it is equal to the disjoint union $\bigcup_{k=0}^{t} (c_k(t) - C_k(t+1))$. Hence, for all $k$, $0 \le k \le t$, $c_k(t) \cap S_{t+1}$ is equal to $c_k(t) - C_k(t+1)$. This (trivially) implies that the random variables $c_k(t) \cap S_{t+1}$, $0 \le k \le t$, are independent and that, for every $k$, $0 \le k \le t$, $c_k(t) \cap S_{t+1}$ has the same distribution as $c_k(t) - C_k(t+1)$. In particular, for every $k$, $0 \le k \le t$,

$$P_{prog}\big[c_k(t) \cap S_{t+1} = \cdot \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] = U_{P_{\delta_k(t+1)}(c_k(t))}[\,\cdot\,].$$

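Lemma 7.7.2 Let $t$, $\lfloor p/n \rfloor \le t \le p - n + 1$, be arbitrary, let $prog \in Prog_0$, let $\sigma$ be a $t$-schedule and $\mathcal{J}$, $\mathcal{J} \subseteq [t]$, such that $P_{prog}[S_t = \sigma,\ J(t+1) = \mathcal{J}] > 0$. Then

$$P_{prog}\big[S_{t+1} = \cdot \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] = \prod_{k=0}^{t} U_{P_{\delta_k(t+1)}(c_k(t))}\big[c_k(t) \cap \cdot\,\big],$$

where $\delta_k(t+1)$ and $c_k(t)$ are the values of $\Delta_k(t+1)$ and $C_k(t)$ uniquely determined by the conditions $S_t = \sigma$ and $J(t+1) = \mathcal{J}$.

PROOF.

$$\begin{aligned}
P_{prog}\big[S_{t+1} = \cdot \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big]
&= P_{prog}\Big[\bigcup_{k=0}^{t} (c_k(t) \cap S_{t+1}) = \bigcup_{k=0}^{t} (c_k(t) \cap \cdot) \;\Big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\Big] \\
&= \prod_{k=0}^{t} P_{prog}\big[c_k(t) \cap S_{t+1} = (c_k(t) \cap \cdot) \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] && (7.30)\\
&= \prod_{k=0}^{t} U_{P_{\delta_k(t+1)}(c_k(t))}\big[c_k(t) \cap \cdot\,\big]. && (7.31)
\end{aligned}$$

Equation 7.30 comes from the fact that the random variables $c_k(t) \cap S_{t+1}$ are independent. Equation 7.31 is a direct consequence of Lemma 7.7.1.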


The next lemma is a simple consequence of the preceding ones and expresses the probability that the set next selected by a protocol in $Prot(Prog_0)$ does not contain an element already faulty.

Lemma 7.7.3 Let $t$, $\lfloor p/n \rfloor \le t \le p - n + 1$, and $j$, $1 \le j \le t$, be arbitrary. Let $prog \in Prog_0$, let $\sigma$ be a $t$-schedule and $\mathcal{J}$, $\mathcal{J} \subseteq [t]$, such that $P_{prog}[S_t = \sigma,\ J(t+1) = \mathcal{J}] > 0$. Let $f_j \in c_j(t)$ be an arbitrary element. Then

$$P_{prog}\big[S_{t+1} \not\ni f_j \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] = U_{P_{\delta_j(t+1)}(c_j(t))}[\,\cdot \not\ni f_j\,].$$

PROOF. Recall that for every $k$, $\delta_k(t+1)$ and $c_k(t)$ are the values of $\Delta_k(t+1)$ and $C_k(t)$ uniquely determined by the conditions $S_t = \sigma$ and $J(t+1) = \mathcal{J}$.

By invariant f of Lemma 7.4.1, $S_{t+1}$ is equal to the disjoint union $S_{t+1} = \bigcup_{k=0}^{t} (S_{t+1} \cap c_k(t))$. By assumption, $f_j$ is in $c_j(t)$. Hence $f_j$ is in $S_{t+1}$ if and only if it is in $S_{t+1} \cap c_j(t)$. This justifies the first of the following equalities. The second one is a direct consequence of Lemma 7.7.1. Hence

$$\begin{aligned}
P_{prog}\big[S_{t+1} \not\ni f_j \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big]
&= P_{prog}\big[(S_{t+1} \cap c_j(t)) \not\ni f_j \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] \\
&= U_{P_{\delta_j(t+1)}(c_j(t))}[\,\cdot \not\ni f_j\,].
\end{aligned}$$


The following lemma is the crux of this section and reveals the role played by the uniform probability distributions used in the design of $Prog_0$. Particularly revealing is the computation leading from Equation 7.36 to Equation 7.37.

Lemma 7.7.4 Let $t$, $1 \le t \le p - n + 1$. Then the value of the probability

$$P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi]$$

is the same for all protocols $\pi_0 \in Prot(Prog_0)$ $^{11}$, for all $t$-schedules $\sigma \in \mathcal{N}_t$ and for all $t$-fault-sequences $\phi$ adapted to $\sigma$ such that $P_{\pi_0}[T \ge t,\ S_t = \sigma,\ F_t = \phi] > 0$.


PROOF. Throughout the proof we consider $\pi_0$, $\sigma$ and $\phi$ such that $P_{\pi_0}[T \ge t,\ S_t = \sigma,\ F_t = \phi] > 0$ and we let prog be an element in $Prog_0$ such that $\pi_0 = \pi_{prog}$.

Recall that we let $f_1, \ldots, f_t$ denote the realizations of $F_1, \ldots, F_t$. Hence the condition $F_t = \phi$ means that $(F_1, \ldots, F_t)$ is determined and equal to $(f_1, \ldots, f_t) = \phi$. Similarly, $S_t = \sigma$ means that the $t$-schedule $(S_1, \ldots, S_t)$ is determined and equal to $(s_1, \ldots, s_t) = \sigma$. The random sets $C_j(u)$, $1 \le j \le u \le t$, are then also uniquely determined and we let $c_j(u)$ denote their realizations. Recall that, by Lemma 7.5.1, $\{T \ge t\} = \bigcap_{u=1}^{t} \{F_u \in C_u(t)\}$. Hence, by conditioning on $\{T \ge t,\ S_t = \sigma,\ F_t = \phi\}$, we restrict ourselves to the case where $f_k$ is in $c_k(t)$ for every $k$, $1 \le k \le t$. This fact, along with the fact that the family $(c_k(t))_{0 \le k \le t}$ is a partition of $[p]$, implies that for every $S$, $S \subseteq [p]$, and every $k$, $1 \le k \le t$, the condition $f_k \in S$ is equivalent to

$$f_k \in (S \cap c_k(t)).$$

$^{11}$ See page 150 for a definition of $Prot(Prog_0)$.

Case I. Assume that $t < \lfloor p/n \rfloor$. Then $P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi] = 1$ and hence the result holds trivially.


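Case II. Assume that $t \ge \lfloor p/n \rfloor$. We have:

$$\begin{aligned}
&P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi] \\
&\quad= P_{\pi_0}\big[S_{t+1} \not\ni f_1, \ldots, S_{t+1} \not\ni f_t \;\big|\; T \ge t,\ S_t = \sigma,\ F_t = \phi\big] \\
&\quad= P_{\pi_0}\big[(S_{t+1} \cap c_1(t)) \not\ni f_1, \ldots, (S_{t+1} \cap c_t(t)) \not\ni f_t \;\big|\; T \ge t,\ S_t = \sigma,\ F_t = \phi\big] && (7.32)\\
&\quad= P_{\pi_0}\big[(S_{t+1} \cap c_1(t)) \not\ni f_1, \ldots, (S_{t+1} \cap c_t(t)) \not\ni f_t \;\big|\; S_t = \sigma\big] && (7.33)\\
&\quad= \sum_{\mathcal{J}} P_{prog}\big[(S_{t+1} \cap c_1(t)) \not\ni f_1, \ldots, (S_{t+1} \cap c_t(t)) \not\ni f_t \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] \cdot P_{prog}[J(t+1) = \mathcal{J} \mid S_t = \sigma] && (7.34)\\
&\quad= \sum_{\mathcal{J}} \prod_{k=1}^{t} P_{prog}\big[(S_{t+1} \cap c_k(t)) \not\ni f_k \;\big|\; S_t = \sigma,\ J(t+1) = \mathcal{J}\big] \cdot P_{prog}[J(t+1) = \mathcal{J} \mid S_t = \sigma] && (7.35)\\
&\quad= \sum_{\mathcal{J}} \prod_{k=1}^{t} U_{P_{\delta_k(t+1)}(c_k(t))}[\,\cdot \not\ni f_k\,] \cdot P_{prog}[J(t+1) = \mathcal{J} \mid S_t = \sigma] && (7.36)\\
&\quad= \sum_{\mathcal{J}} \prod_{k=1}^{t} \frac{|c_k(t+1)|}{|c_k(t)|} \cdot P_{prog}[J(t+1) = \mathcal{J} \mid S_t = \sigma] && (7.37)\\
&\quad= \frac{\prod_{k=1}^{t} |c_k(t+1)|}{\prod_{k=1}^{t} |c_k(t)|} \sum_{\mathcal{J}} P_{prog}[J(t+1) = \mathcal{J} \mid S_t = \sigma] && (7.38)\\
&\quad= \frac{\prod_{k=1}^{t} |c_k(t+1)|}{\prod_{k=1}^{t} |c_k(t)|}.
\end{aligned}$$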

Equation 7.32 stems from the fact that, as we saw at the beginning of this proof, for every $k$, $1 \le k \le t$, $f_k$ is an element of $c_k(t)$ and hence the condition $S_{t+1} \not\ni f_k$ is equivalent to $(S_{t+1} \cap c_k(t)) \not\ni f_k$.

We used in the beginning of the proof the formula $\{T \ge t\} = \bigcap_{u=1}^{t} \{F_u \in C_u(t)\}$. We now find it convenient to use the alternate expression $\{T \ge t\} = \bigcap_{u=1}^{t} \{S_u \cap \{F_1, \ldots, F_{u-1}\} = \emptyset\}$. (Both expressions are presented in Lemma 7.5.1.) The event $\{F_t = \phi,\ T \ge t\}$ is equal to the event $\{F_t = \phi,\ \bigcap_{u=1}^{t} \{S_u \cap \{f_1, \ldots, f_{u-1}\} = \emptyset\}\}$ which, when conditioned on $S_t = \sigma$, is itself equal to the event $\{F_t = \phi,\ \bigcap_{u=1}^{t} \{s_u \cap \{f_1, \ldots, f_{u-1}\} = \emptyset\}\}$. Note that the relation $\bigcap_{u=1}^{t} \{s_u \cap \{f_1, \ldots, f_{u-1}\} = \emptyset\}$ only expresses a relation between the deterministic quantities $s_1, f_1, \ldots, s_t$. Hence the probabilistic event $\{F_t = \phi,\ \bigcap_{u=1}^{t} \{s_u \cap \{f_1, \ldots, f_{u-1}\} = \emptyset\}\}$ is actually equal to $\{F_t = \phi\}$. As, by Lemma 7.3.1, conditioned on $S_t = \sigma$, the random variables $F_t$ and $S_{t+1}$ are independent, we therefore similarly have that, conditioned on $S_t = \sigma$, the random variable $S_{t+1}$ is independent from the event $\{F_t = \phi,\ T \ge t\}$. This justifies Equation 7.33.


In Equation 7.34, we condition on the value taken by $J(t+1)$, i.e., on the outcome of an internal action of the program prog associated to $\pi_0$. As discussed at the beginning of this section, the random variable $J(t+1)$ is not measurable in the $\sigma$-field associated to protocols, and we must therefore consider the probability space $(\Omega', \mathcal{G}', P_{prog})$, which allows us to measure the randomized invocations made by prog. This explains why we consider $P_{prog}$ in place of $P_{\pi_0}$.

By Lemma 7.7.1 the random variables $S_{t+1} \cap c_k(t)$, $1 \le k \le t$, are independent. This justifies Equation 7.35. Using Lemma 7.7.3 along with the fact that $S_{t+1} \not\ni f_k$ is equivalent to $(S_{t+1} \cap c_k(t)) \not\ni f_k$ justifies Equation 7.36.

Note that we replaced in Equation 7.32 the conditions $S_{t+1} \not\ni f_k$ by the conditions $(S_{t+1} \cap c_k(t)) \not\ni f_k$ and that we replaced back each condition $(S_{t+1} \cap c_k(t)) \not\ni f_k$ by $S_{t+1} \not\ni f_k$ while establishing Equation 7.36. The reason for the introduction of the sets $c_k(t)$ is to be able to use the (conditional) independence of the random variables $S_{t+1} \cap c_k(t)$ and to obtain the product introduced in Equation 7.35.


Recall that we defined $\delta_k(t+1)$ to be equal to the value $|c_k(t)| - |c_k(t+1)|$. Hence $U_{P_{\delta_k(t+1)}(c_k(t))}[f_k \notin \cdot\,]$ is equal to $U_{P_{|c_k(t+1)|}(c_k(t))}[f_k \in \cdot\,]$, which is equal to $|c_k(t+1)| / |c_k(t)|$. This establishes Equation 7.37. Note that, for each $k$, the value $c_k(t+1)$ does depend on the values $\sigma$ and $\mathcal{J}$ taken by $S_t$ and $J(t+1)$. Nevertheless, by Lemma 7.4.5, the product $\prod_{k=1}^{t} |c_k(t+1)|$ (resp. $\prod_{k=1}^{t} |c_k(t)|$) is the same for all programs $prog \in Prog_0$ and all values $\sigma$, $\mathcal{J}$ and $\phi$ taken by $S_t$, $J(t+1)$ and $F_t$. We can therefore factor the term $\prod_{k=1}^{t} |c_k(t+1)| / \prod_{k=1}^{t} |c_k(t)|$ out of the summation over $\mathcal{J}$. This justifies Equation 7.38.
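The step from Equation 7.36 to Equation 7.37 relies on a counting fact about uniformly drawn subsets: if a subset of size $\delta$ is drawn uniformly from a finite set $c$, a fixed element of $c$ is left out with probability $(|c| - \delta)/|c|$. A minimal sketch checking this by enumeration is given below; the function name and the example set are hypothetical and are not part of the code of the programs discussed in the text.

```python
from fractions import Fraction
from itertools import combinations


def prob_not_selected(c, delta, f):
    """Probability that f is NOT in a uniformly drawn subset of c of size delta."""
    subsets = list(combinations(sorted(c), delta))
    misses = sum(1 for s in subsets if f not in s)
    return Fraction(misses, len(subsets))


# Hypothetical example: |c| = 5 and delta = 2, so 3 elements remain after the draw.
c = {1, 2, 3, 4, 5}
delta = 2
print(prob_not_selected(c, delta, 3), Fraction(len(c) - delta, len(c)))
```

With $|c| = |c_k(t)|$ and $\delta = \delta_k(t+1)$, the remaining count $|c| - \delta$ plays the role of $|c_k(t+1)|$.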


This establishes that $P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi]$ is the same for all protocols $\pi_0 \in Prot(Prog_0)$, all $t$-schedules $\sigma \in \mathcal{N}_t$ and all $t$-fault-sequences $\phi$ adapted to $\sigma$.

Note that if $k \le \lfloor p/n \rfloor$ then $c_k(u) = s_k$ for all times $u$, $k \le u \le \lfloor p/n \rfloor$. Therefore $|c_k(t+1)| / |c_k(t)| = 1$ for all $t$ and $k$, $1 \le k \le t < \lfloor p/n \rfloor$.

Assume now that $t \ge p - n + 1$. Let $\sigma$ be a $t$-schedule and let $s \in P_n(p)$ be such that $(\sigma, s)$ is a $(t+1)$-schedule. Let $\pi_0$ be a protocol in $Prot(Prog_0) \cap Pgenerating(\sigma)$. By Lemma 7.6.8, $t_{\max}(A_0) = p - n + 1$. By definition of $t_{\max}(A_0)$ this implies that $P_{A_0}[T \ge t+1 \mid S_{t+1} = (\sigma, s)] = 0$ and consequently that $(\sigma, s) \notin Feas_{A_0}$. Hence, as is established at the beginning of the proof of Lemma 7.6.10, $\prod_{k=1}^{t} |c_k(t+1)| = 0$. Using the convention $0/0 = 0$ (see Convention 8.1.1, page 198) we see that $(\prod_{k=1}^{t} |c_k(t+1)|) / (\prod_{k=1}^{t} |c_k(t)|) = 0$. On the other hand, again by Lemma 7.6.8 and Convention 8.1.1, $P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi] = 0$ if $t \ge p - n + 1$.


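This shows that the equality

$$P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi] = \frac{\prod_{k=1}^{t} |c_k(t+1)|}{\prod_{k=1}^{t} |c_k(t)|}$$

holds for every $t$, $t \ge 1$.

This result, along with Formula 7.13 in Lemma 7.6.10, establishes the following remarkable fact:$^{12}$

For every $t$, $t \ge 1$, every $(t+1)$-schedule $(\sigma, s) \in \mathcal{N}_{t+1}$, every $t$-fault-sequence $\phi$ adapted to $\sigma$ and every protocol $\pi_0 \in Prot(Prog_0) \cap Pgenerating(\sigma)$:

$$P_{A_0}[T \ge t+1 \mid T \ge t,\ S_{t+1} = (\sigma, s)] = \prod_{j=1}^{t} \frac{|c_j(t+1)|}{|c_j(t)|} = P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi],$$

where the values $c_j(t)$, $1 \le j \le t$, and $c_j(t+1)$, $1 \le j \le t$, are the c-values related to the $(t+1)$-schedule $(\sigma, s)$.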


This result constitutes the core technical result of this chapter, whose consequences ultimately lead to the fundamental equality

$$\sup_\pi E_{\pi,A_0}[T] = E_{\pi_0,A_0}[T] = \inf_A E_{\pi_0,A}[T],$$

establishing Theorem 7.4.6.


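Lemma 7.7.5 Let $t$, $1 \le t \le p - n + 1$. Then, for every $\pi_0 \in Prot(Prog_0)$, the probability $P_{\pi_0,A}[T \ge t+1 \mid T \ge t]$ is independent of the adversary $A$.

$^{12}$ Recall that, by convention, we set $0/0 = 0$.

PROOF.

$$\begin{aligned}
P_{\pi_0,A}[T \ge t+1 \mid T \ge t]
&= \sum_{\sigma,\phi} P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi] \cdot P_{\pi_0,A}[S_t = \sigma,\ F_t = \phi \mid T \ge t] \\
&= P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi] \sum_{\sigma,\phi} P_{\pi_0,A}[S_t = \sigma,\ F_t = \phi \mid T \ge t] && (7.39)\\
&= P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi].
\end{aligned}$$

Equation 7.39 holds because, by Lemma 7.7.4, the value $P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi]$ is the same for all $\sigma$ and $\phi$ (adapted to $\sigma$). Furthermore, the value $P_{\pi_0}[T \ge t+1 \mid T \ge t,\ S_t = \sigma,\ F_t = \phi]$ does not depend on $A$, which immediately implies that $P_{\pi_0,A}[T \ge t+1 \mid T \ge t]$ similarly does not depend on $A$. This concludes the proof.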


Lemma 7.7.6 Let $t$, $1 \le t \le p - n + 1$. Then, for every $\pi_0 \in Prot(Prog_0)$, the probability $P_{\pi_0,A}[T \ge t]$ is independent of the adversary $A$.

PROOF. $P_{\pi_0,A}[T \ge t] = P_{\pi_0,A}[T \ge t \mid T \ge t-1] \cdots P_{\pi_0,A}[T \ge 2 \mid T \ge 1]$. The result then follows from Lemma 7.7.5.

We are now finally able to prove Lemma 7.4.8.

Corollary 7.7.7 For every $\pi_0 \in Prot(Prog_0)$, the expectation $E_{\pi_0,A}[T]$ is independent of the adversary $A$.


PROOF.

$$\begin{aligned}
E_{\pi_0,A}[T] &= \sum_{t \ge 1} P_{\pi_0,A}[T \ge t] && (7.40)\\
&= \sum_{t=1}^{t_{\max}(A)} P_{\pi_0,A}[T \ge t] && (7.41)\\
&= \sum_{t=1}^{t_{\min}(\pi_0)} P_{\pi_0,A}[T \ge t]. && (7.42)
\end{aligned}$$

Equality 7.40 is justified by Lemma 8.1.2. Equality 7.41 is justified by the fact that $P_{\pi_0,A}[T \ge t] = 0$ if $t > t_{\max}(A)$. By Lemma 7.6.8, for all $A$, $t_{\max}(A) = t_{\min}(\pi_0) = p - n + 1$. This justifies Equality 7.42. By Lemma 7.7.6, for every $t$, $1 \le t \le t_{\min}(\pi_0)$, each of the terms $P_{\pi_0,A}[T \ge t]$ is independent of $A$. Hence the whole sum $\sum_{t=1}^{t_{\min}(\pi_0)} P_{\pi_0,A}[T \ge t]$ is independent of $A$.


Chapter 8 


General Mathematical Results 


8.1 Some Useful Notions in Probability Theory 


Notation: 1) For every finite set $\Phi$, $\Phi \ne \emptyset$, we let $U_\Phi$ denote the uniform distribution on the set $\Phi$. We will find it convenient to extend this definition to $\Phi = \emptyset$ and let $U_\emptyset(A)$ be identically equal to 0 for every set $A$. (Hence $U_\emptyset$ is not a probability distribution.)

2) For every set $X$ we let $P(X)$ denote the power set of $X$. For every integer $\delta$ we let $P_\delta(X)$ denote the set of subsets of $X$ of size $\delta$:

$$P_\delta(X) \stackrel{\mathrm{def}}{=} \{Y \subseteq X \,;\, |Y| = \delta\}.$$

Hence, $P_0(X) = \{\emptyset\}$ and $P_\delta(X) = \emptyset$ for all $\delta > |X|$. To simplify the notations we write $P_n(p)$ in place of $P_n([p])$.

3) For all integers $k$ and $l$, $k \le l$, we let $[l]$ denote the set $\{1, \ldots, l\}$ and let $P_k(l) = \{s \,;\, s \subseteq [l],\ |s| = k\}$.


We can mix the two previous definitions and consider $U_{P_\delta(X)}$, the uniform distribution on $P_\delta(X)$. The following lemma investigates some special cases associated to this situation.

Lemma 8.1.1 1) $U_{P_\delta(X)} \equiv 0$ if $\delta > |X|$. 2) $U_{P_0(X)} = \delta_\emptyset$, the Dirac measure on $\emptyset$. Equivalently, $U_{P_0(X)}$ is the probability measure selecting the empty set with probability 1.

PROOF. Assume that $\delta > |X|$. Then, for every set $A$, $U_{P_\delta(X)}(A) = U_\emptyset(A) = 0$. On the other hand, $U_{P_0(X)} = U_{\{\emptyset\}}$ is the uniform distribution on the singleton set $\{\emptyset\}$, i.e., the distribution selecting $\emptyset$ with probability one.

Convention 8.1.1 Let $(\Omega, \mathcal{G}, P)$ be some probability space. For all $A$ and $B$ in $\mathcal{G}$ we set $P[B \mid A] = 0$ whenever $P[A] = 0$. We accordingly set to 0 the ratio $0/0$.

Definition 8.1.1 Let $(\Omega, \mathcal{G}, P)$ be a probability space. For all sets $A$ and $B$ in $\mathcal{G}$ we say that $A$ and $B$ are $P$-equivalent if $P[A \Delta B] = 0$, where $A \Delta B$ denotes the symmetric difference of $A$ and $B$. An element $A$ of $\mathcal{G}$ is called a $P$-atom if, for every $B$ in $\mathcal{G}$, $B$ is a subset of $A$ only if $B$ is $P$-equivalent to $A$ or to $\emptyset$.


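Definition 8.1.2 Let $(\Omega_1, \mathcal{G}_1, P_1)$ be some probability space and let $X : (\Omega_1, \mathcal{G}_1, P_1) \to (\Omega_2, \mathcal{G}_2)$ be a random variable. Then the law of $X$ is the probability distribution $P_2$ on $(\Omega_2, \mathcal{G}_2)$ defined by:

$$\forall B \in \mathcal{G}_2,\quad P_2[B] = P_1[X \in B].$$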
We write 
For $A \in \mathcal{G}_1$, $P_1[A] \ne 0$, the random variable $X|A$ is by definition a random variable $X'$ whose law $P_3$ is given by:

$$\forall B \in \mathcal{G}_2,\quad P_3[B] = P[X' \in B] = P_1[X \in B \mid A].$$

Following Convention 8.1.1, we set $X|A \equiv 0$ if $P_1[A] = 0$.


Lemma 8.1.2 Let $T$ be any non-negative random variable defined on some probability space $(\Omega, \mathcal{G}, P)$. Then

$$E[T] = \int_0^\infty P[T > t]\, dt.$$

If $T$ is integer valued, the preceding translates into

$$E[T] = \sum_{t=1}^{\infty} P[T \ge t].$$


t=1 
PROOF. For every interval $[0, \tau[$ we let $\chi_{[0,\tau[}$ denote the function equal to 1 on $[0, \tau[$ and 0 elsewhere. Also, for simplicity of notation, we think of the expectation $E$ as being an operator and let $Ef$ denote $E[f]$.

$$\begin{aligned}
ET &= E \int_0^T dt && (8.1)\\
&= E \int_0^\infty \chi_{[0,T[}(t)\, dt \\
&= \int_\Omega \int_0^\infty \chi_{[0,T(x)[}(t)\, dt\, dP(x) && (8.2)\\
&= \int_0^\infty \int_\Omega \chi_{[0,T(x)[}(t)\, dP(x)\, dt && (8.3)\\
&= \int_0^\infty P[T > t]\, dt.
\end{aligned}$$

Equation 8.1 comes from the simple observation that $T = \int_0^T dt$. Equation 8.2 is a particular case of the general formula $E[f] = \int f(x)\, dP(x)$. Equation 8.3 is an application of Fubini's theorem. In the case where $T$ is integer valued,

$$E[T] = \int_0^\infty P[T > t]\, dt = \sum_{t=1}^{\infty} \int_{t-1}^{t} P[T > u]\, du = \sum_{t=1}^{\infty} P[T \ge t],$$

since $P[T > u] = P[T \ge t]$ for every $u \in [t-1, t[$.
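The integer-valued case of Lemma 8.1.2 is the form used throughout Chapter 7, so a small numerical check may help. The sketch below computes the expectation of a non-negative integer-valued variable both directly and as the sum of the tail probabilities $P[T \ge t]$, $t \ge 1$. The distribution and the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def expectation(dist):
    """Direct expectation of a finite distribution {value: probability}."""
    return sum(t * p for t, p in dist.items())


def tail_sum(dist):
    """Sum over t >= 1 of P[T >= t], for a non-negative integer-valued T."""
    t_max = max(dist)
    return sum(
        sum(p for value, p in dist.items() if value >= t)
        for t in range(1, t_max + 1)
    )


# Hypothetical distribution of a survival time T with values in {0, 1, 3}.
dist = {0: Fraction(1, 4), 1: Fraction(1, 4), 3: Fraction(1, 2)}
print(expectation(dist), tail_sum(dist))
```

Both computations return the same value because each outcome $T = v$ is counted once in each of the $v$ tail events $\{T \ge 1\}, \ldots, \{T \ge v\}$.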


The following lemma states a well-known property of the uniform distribution. 


Lemma 8.1.3 Let $A$ be a finite set and let $B$ be a non-empty subset of $A$. Then $U_A[\,\cdot \mid B] = U_{A \cap B}$.


Lemma 8.1.4 Let $(\Omega, \mathcal{G}, P)$ be some probability space, let $S : (\Omega, \mathcal{G}) \to (E, P(E))$ be a random variable defined on $\Omega$. Then for every $B \in \mathcal{G}$,

$$P[B] \le \max_{s \in E} P[B \mid S = s].$$

PROOF. $P(E)$ being the $\sigma$-field attached to $E$, the set $\{S = s\}$ is measurable for every $s$ in $E$. Hence $P[B \mid S = s]$ is well-defined if $P[S = s] \ne 0$. On the other hand, by Convention 8.1.1, $P[B \mid S = s]$ is set to 0 when $P[S = s] = 0$. Hence $P[B \mid S = s]$ is well-defined for all $s$ in $E$.

$$\begin{aligned}
P[B] &= \sum_{s \in E} P[B \mid S = s]\, P[S = s] \\
&\le \max_{s \in E} P[B \mid S = s] \sum_{s \in E} P[S = s] \\
&= \max_{s \in E} P[B \mid S = s].
\end{aligned}$$

Lemma 8.1.5 Let $B$ and $A$ be any real-valued random variables. Then

$$\forall x \in \mathbb{R},\quad P[B \ge x \mid B \le A] \le P[B \ge x].\,^{1}$$


PROOF. The result is obvious if $P[B \le A] = 0$. We can therefore restrict ourselves to the case $P[B \le A] > 0$. Assume first that $A$ is a constant $a$.

If $a < x$ the result is trivially true, so we assume that $a \ge x$. Set $p = P[B > a]$ and $\theta = P[B \ge x]$. Then:

$$P[B \ge x \mid B \le a] = \frac{P[x \le B \le a]}{P[B \le a]} = \frac{\theta - p}{1 - p} \stackrel{\mathrm{def}}{=} \varphi_\theta(p).$$

For $\theta \in [0,1]$, the function $\varphi_\theta$ is non-increasing on $[0,1)$. Hence:

$$P[B \ge x \mid B \le a] \le \varphi_\theta(0) = P[B \ge x].$$

We then turn to the case where $A$ is a general random variable.$^{2}$ We will let $dP_A$ denote the distribution of $A$ so that, for any measurable set $U$, $dP_A[U] = P[A \in U]$. Then:

$$\begin{aligned}
P[A \ge B \ge x] &= \int P[a \ge B \ge x]\, dP_A(a) \\
&= \int P[B \ge x \mid B \le a]\, P[B \le a]\, dP_A(a) \\
&\le \int P[B \ge x]\, P[B \le a]\, dP_A(a) \qquad \text{(we use here the result valid for } A \equiv a \text{ constant)} \\
&= P[B \ge x] \int P[B \le a]\, dP_A(a) \\
&= P[B \ge x]\, P[B \le A].
\end{aligned}$$

Then we just need to divide by $P[B \le A]$ to get the result we are after.
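The inequality of Lemma 8.1.5 can be checked on small discrete examples. The sketch below assumes that $A$ and $B$ are independent with finite supports, which is how the integral step of the proof above treats them; this independence is an assumption of the sketch, and the function names and distributions are hypothetical.

```python
from fractions import Fraction


def tail(dist_b, x):
    """P[B >= x] for a finite distribution {value: probability}."""
    return sum(p for b, p in dist_b.items() if b >= x)


def conditional_tail(dist_a, dist_b, x):
    """P[B >= x | B <= A] for independent A and B, with the convention 0/0 = 0."""
    joint = [(pa * pb, a, b) for a, pa in dist_a.items() for b, pb in dist_b.items()]
    denom = sum(w for w, a, b in joint if b <= a)
    numer = sum(w for w, a, b in joint if x <= b <= a)
    return numer / denom if denom else Fraction(0)


# Hypothetical example: B uniform on {1, 2, 3, 4}, A uniform on {2, 4}.
dist_b = {b: Fraction(1, 4) for b in (1, 2, 3, 4)}
dist_a = {2: Fraction(1, 2), 4: Fraction(1, 2)}
print(conditional_tail(dist_a, dist_b, 3), tail(dist_b, 3))
```

Conditioning on $B \le A$ removes mass from the large values of $B$, so the conditional tail is the smaller of the two printed values.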


$^{1}$ Recall that, using Convention 8.1.1, we set $0/0 = 0$ whenever this quantity arises in the computation of conditional probabilities.

$^{2}$ Note that we cannot simply extend the previous proof in this case: $A$ can sometimes be less than $x$ and then it is not true anymore that $\{B > A\} \subseteq \{x \le B\}$. We used this when saying that $P[x \le B \le a] = P[x \le B] - P[B > a]$.

If $A$ is a discrete variable, its distribution is absolutely continuous with respect to the counting measure, so that the expression $\int P[a \ge B \ge x]\, dP_A(a)$ reduces to $\sum_a P[a \ge B \ge x]\, P[A = a]$.

We recall in the following definition the notion of stochastic (partial) ordering: this 


ordering is defined within the set of real random variables. 


Definition 8.1.3 Let $X$ and $Y$ be two real random variables. Then the following
conditions are equivalent:


1. For all $a \in \mathbb{R}$, $P[X \ge a] \le P[Y \ge a]$.


2. For every continuous, bounded and non-decreasing function $f$,
$\int f(x) \, dP_X(x) \le \int f(x) \, dP_Y(x)$ (i.e., $dP_X \le dP_Y$).


We then say that $X$ is stochastically smaller than $Y$ (or alternatively smaller in
law) and we write $X \le_{\mathcal{L}} Y$.


Note that if $X \le Y$ in the usual sense (i.e., almost surely), then $X$ is also stochastically
smaller than $Y$. Actually, among all the orders that are usually considered
on the set of random variables (e.g., almost surely, for some $L^p$-norm, for the
essential norm $L^\infty$) the stochastic ordering is the weakest.
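
For discrete laws, the tail condition of the definition can be tested mechanically. The following is a minimal sketch, assuming laws given as hypothetical `{value: probability}` dictionaries and the tail written as $P[X \ge a]$; the helper names are not from the text.

```python
from fractions import Fraction as Fr

def tail(law, a):
    """P[X >= a] for a discrete law given as {value: probability}."""
    return sum(p for v, p in law.items() if v >= a)

def stochastically_smaller(law_x, law_y):
    """Tail condition of the definition, tested at every atom of either law.

    For discrete laws both tails are step functions that only change at
    atoms, so testing the atoms is enough.
    """
    thresholds = set(law_x) | set(law_y)
    return all(tail(law_x, a) <= tail(law_y, a) for a in thresholds)

# Hypothetical laws chosen for the example.
law_x = {0: Fr(1, 2), 1: Fr(1, 2)}
law_y = {0: Fr(1, 4), 2: Fr(3, 4)}

print(stochastically_smaller(law_x, law_y))  # is X smaller in law than Y?
print(stochastically_smaller(law_y, law_x))  # and the converse?
```

The test only involves the two laws, not a joint law, which reflects that the ordering compares distributions rather than the variables themselves.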


Formulated in this language, Lemma 8.1.5 just says that


$(B \mid B < A) \le_{\mathcal{L}} B$.

The result of Lemma 8.1.5 can be generalized into:


Lemma 8.1.6 Let $A_1$, $A_2$ and $B$ be any real-valued random variables.³ Assume
that $A_1 \le_{\mathcal{L}} A_2$. Then
$(B \mid B < A_1) \le_{\mathcal{L}} (B \mid B < A_2)$.


PROOF. The proof is similar to the one of Lemma 8.1.5. (Lemma 8.1.5 corresponds
to the case $A_2 = \infty$.)


Lemma 8.1.7 Let $A$, $B$ and $C$ be any real-valued random variables. Then
$P[B \ge C \mid B < A] \le P[B \ge C]$.


³As is customary in measure theory, we allow the random variables to take the value $\infty$.
PROOF. We integrate the inequality of Lemma 8.1.5:
\[
P[B \ge C \mid B < A] = \int P[B \ge x \mid B < A] \, dP_C(x)
\le \int P[B \ge x] \, dP_C(x) = P[B \ge C].
\]


Lemma 8.1.8 Let $A_1$, $A_2$ and $B$ be any real-valued random variables. Assume
that $A_1 \le_{\mathcal{L}} A_2$. Then
$P[A_1 \ge B] \le P[A_2 \ge B]$.


PROOF. $P[A_1 \ge B] = \int P[A_1 \ge x] \, dP_B(x) \le \int P[A_2 \ge x] \, dP_B(x) = P[A_2 \ge B]$.


8.2 Max-Min Inequalities 


Lemma 8.2.1 Let $f(x,y)$ be a real function of two variables defined over a domain
$X \times Y$. Then


\[
\sup_{x \in X} \inf_{y \in Y} f(x,y) \le \inf_{y \in Y} \sup_{x \in X} f(x,y).
\]


PROOF. Let $x_0$ and $y_0$ be two arbitrary elements in $X$ and $Y$ respectively. Obviously,
$f(x_0, y_0) \le \sup_x f(x, y_0)$. This inequality is true for every $y_0$ so that $\inf_y f(x_0, y)
\le \inf_y \sup_x f(x,y)$. Note that $\inf_y \sup_x f(x,y)$ is a constant. The last inequality is
true for every $x_0$ so that $\sup_x \inf_y f(x, y) \le \inf_y \sup_x f(x, y)$.
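
On finite domains the two sides of the max-min inequality can be computed by brute force. The sketch below is a hedged illustration: the function `f` and the integer grid are hypothetical choices made so that the inequality is strict.

```python
def sup_inf(f, xs, ys):
    """sup over x of inf over y, for finite domains."""
    return max(min(f(x, y) for y in ys) for x in xs)

def inf_sup(f, xs, ys):
    """inf over y of sup over x, for finite domains."""
    return min(max(f(x, y) for x in xs) for y in ys)

# Hypothetical example: whoever moves second can react to the other's choice.
xs = ys = range(-3, 4)
f = lambda x, y: (x - y) ** 2

lower = sup_inf(f, xs, ys)   # y is chosen after x and can match it
upper = inf_sup(f, xs, ys)   # x is chosen after y and can move away from it
print(lower, upper, lower <= upper)
```

With this choice of `f` the minimizing variable cancels the payoff when it moves second, while the maximizing variable secures a positive payoff when it moves second.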


Lemma 8.2.2 Let $f(x,y)$ be a real function of two variables defined over a domain
$X \times Y$. For every $x_0 \in X$ and $y_0 \in Y$ we have $\inf_{y \in Y} f(x_0, y) \le \sup_{x \in X} f(x, y_0)$.
Furthermore, the previous inequality is an equality only if


\[
\max_{x \in X} \inf_{y \in Y} f(x,y) = \min_{y \in Y} \sup_{x \in X} f(x,y) = f(x_0, y_0).
\]

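PROOF.
\[
\begin{aligned}
\inf_{y \in Y} f(x_0, y) &\le \sup_{x \in X} \inf_{y \in Y} f(x,y) \\
  &\le \inf_{y \in Y} \sup_{x \in X} f(x,y) \quad \text{(by Lemma 8.2.1)} \\
  &\le \sup_{x \in X} f(x, y_0).
\end{aligned}
\]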
If $\inf_{y \in Y} f(x_0, y) = \sup_{x \in X} f(x, y_0)$ the previous inequalities are equalities so that
$\inf_{y \in Y} f(x_0, y) = \sup_{x \in X} \inf_{y \in Y} f(x, y)$. This shows that the previous sup is achieved
(for $x = x_0$) and hence that $\sup_{x \in X} \inf_{y \in Y} f(x, y) = \max_{x \in X} \inf_{y \in Y} f(x,y)$. Similarly
$\inf_{y \in Y} \sup_{x \in X} f(x,y) = \min_{y \in Y} \sup_{x \in X} f(x,y)$. To finish, note that, obviously,
$\inf_{y \in Y} f(x_0, y) \le f(x_0, y_0) \le \sup_{x \in X} f(x, y_0)$. Hence the equality $\inf_{y \in Y} f(x_0, y) =
\sup_{x \in X} f(x, y_0)$ occurs only if $\max_{x \in X} \inf_{y \in Y} f(x,y) = \min_{y \in Y} \sup_{x \in X} f(x,y) =
f(x_0, y_0)$.


The following result strengthens the preceding one. 


Proposition 8.2.3 Let $f(x,y)$ be a real function of two variables defined over a
domain $X \times Y$. Define $O_1 = \{x' \in X ; \inf_{y \in Y} f(x', y) = \sup_{x \in X} \inf_{y \in Y} f(x, y)\}$ and
similarly $O_2 = \{y' \in Y ; \sup_{x \in X} f(x, y') = \inf_{y \in Y} \sup_{x \in X} f(x, y)\}$. Then there exist
$x_0 \in X$ and $y_0 \in Y$ such that


\[
\inf_{y \in Y} f(x_0, y) = \sup_{x \in X} f(x, y_0)
\]


if and only if


\[
\max_{x \in X} \inf_{y \in Y} f(x,y) = \min_{y \in Y} \sup_{x \in X} f(x,y).
\]


Furthermore, if this condition is satisfied, $O_1$ and $O_2$ are both non-empty and for
every $x_0 \in O_1$ and every $y_0 \in O_2$ we have


\[
\inf_{y} f(x_0, y) = \sup_{x} f(x, y_0) = f(x_0, y_0) = \max_{x} \inf_{y} f(x,y) = \min_{y} \sup_{x} f(x,y).
\]


PROOF. By Lemma 8.2.2, there exist $x_0 \in X$ and $y_0 \in Y$ such that
\[
\inf_{y \in Y} f(x_0, y) = \sup_{x \in X} f(x, y_0)
\]
only if $\max_{x \in X} \inf_{y \in Y} f(x,y) = \min_{y \in Y} \sup_{x \in X} f(x, y)$. Conversely assume that
$\max_{x \in X} \inf_{y \in Y} f(x,y) = \min_{y \in Y} \sup_{x \in X} f(x,y)$. Part of the assumption is that
$\sup_{x \in X} \inf_{y \in Y} f(x,y) = \max_{x \in X} \inf_{y \in Y} f(x,y)$, which means that $O_1$ is non-empty.
Symmetrically, $O_2$ is non-empty. Let $x_0$ and $y_0$ be any two elements of $O_1$ and $O_2$
respectively. By definition, $\inf_y f(x_0, y) = \max_{x \in X} \inf_{y \in Y} f(x, y)$ and $\sup_x f(x, y_0) =
\min_{y \in Y} \sup_{x \in X} f(x,y)$. By assumption these two values are equal so that $\inf_y f(x_0, y)
= \sup_x f(x, y_0)$. By Lemma 8.2.2, this common value is equal to $f(x_0, y_0)$.
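
The sets $O_1$ and $O_2$ of Proposition 8.2.3 are easy to compute when $X$ and $Y$ are finite, in which case every sup and inf is attained. The following sketch uses a hypothetical payoff table with a saddle point; the function name `optimal_sets` is an assumption of the example.

```python
def optimal_sets(f, xs, ys):
    """Return (O1, O2, maxmin, minmax) for a function on finite domains."""
    inf_y = {x: min(f(x, y) for y in ys) for x in xs}
    sup_x = {y: max(f(x, y) for x in xs) for y in ys}
    maxmin, minmax = max(inf_y.values()), min(sup_x.values())
    o1 = [x for x in xs if inf_y[x] == maxmin]   # maximizers of the inner inf
    o2 = [y for y in ys if sup_x[y] == minmax]   # minimizers of the inner sup
    return o1, o2, maxmin, minmax

# Hypothetical table: table[x][y] = f(x, y).
table = [[4, 2, 3],
         [5, 3, 6],
         [1, 0, 2]]
xs = ys = [0, 1, 2]
f = lambda x, y: table[x][y]

o1, o2, maxmin, minmax = optimal_sets(f, xs, ys)
print(o1, o2, maxmin, minmax)
# When maxmin == minmax, every pair in O1 x O2 takes the common value.
print(all(f(x0, y0) == maxmin for x0 in o1 for y0 in o2))
```

In this table the entry at row 1, column 1 is both the minimum of its row and the maximum of its column, which is the saddle-point situation described by the proposition.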


Lemma 8.2.4 Let $X$ and $Y$ be two sets such that for every $(x,y) \in X \times Y$,
$(a_{x,y}(t))_{t \in \mathbb{N}}$ is a sequence of non-negative numbers. For every $x \in X$ we define


\[
t_{\min}(x) \stackrel{\text{def}}{=} \sup \{ t ; \inf_{y} a_{x,y}(t) > 0 \}.
\]

Similarly, for every $y \in Y$ we define
\[
t_{\max}(y) \stackrel{\text{def}}{=} \sup \{ t ; \sup_{x} a_{x,y}(t) > 0 \}.
\]


Then
\[
\sup_{x} t_{\min}(x) \le \inf_{y} t_{\max}(y).
\]

PROOF. Define $Y'$ to be the set $Y' \stackrel{\text{def}}{=} \{y \in Y ; t_{\max}(y) < \infty\}$. The result is obvious if
$t_{\max}(y) = \infty$ for every $y \in Y$. We therefore consider the case where $t_{\max}(y) < \infty$ for
some $y$ in $Y$, i.e., when $Y'$ is non-empty. Furthermore, the equality $\inf_{y \in Y} t_{\max}(y) =
\inf_{y \in Y'} t_{\max}(y)$ shows that we can restrict our analysis to within $Y'$ in place of $Y$. Let
$x_1$ and $y_1$ be two arbitrary elements from $X$ and $Y'$ respectively. The definition of
$t_{\max}(y_1)$ implies that $a_{x_1,y_1}(t) = 0$ for every $t > t_{\max}(y_1)$ and hence that
$\inf_y a_{x_1,y}(t) = 0$ for every such $t$.
By definition of $t_{\min}(x_1)$ this implies that $t_{\min}(x_1) \le t_{\max}(y_1)$. The elements $x_1$ and
$y_1$ being arbitrary, we obtain that


\[
\sup_{x} t_{\min}(x) \le \inf_{y} t_{\max}(y).
\]


By assumption, all the numbers $a_{x,y}(t)$ are non-negative and $\inf_y t_{\max}(y) < \infty$ so
that $0 \le \sup_x t_{\min}(x) \le \inf_y t_{\max}(y) < \infty$. The inequality $\sup_x t_{\min}(x) < \infty$ shows
that $\sup_x t_{\min}(x) = \max_x t_{\min}(x)$. Similarly, the inequality $0 \le \inf_y t_{\max}(y)$ implies
that $\inf_y t_{\max}(y) = \min_y t_{\max}(y)$. Therefore, in the case where $\inf_y t_{\max}(y) < \infty$, the
max-min inequality can be strengthened into


\[
\max_{x} t_{\min}(x) \le \min_{y} t_{\max}(y).
\]
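
The quantities $t_{\min}$ and $t_{\max}$ can be computed directly when the sequences are finite. The sketch below is only an illustration under stated assumptions: the data in `a` are hypothetical, the sequences are taken to vanish beyond a fixed horizon, and the sup of an empty set is set to 0 by a convention of the sketch.

```python
HORIZON = 5  # the sequences are assumed to vanish for t >= HORIZON

# Hypothetical data: a[x][y][t] stands for a_{x,y}(t), t = 0, ..., HORIZON - 1.
a = {
    "x1": {"y1": [1, 1, 1, 0, 0], "y2": [1, 1, 0, 0, 0]},
    "x2": {"y1": [1, 1, 1, 1, 0], "y2": [1, 0, 0, 0, 0]},
}
X, Y = list(a), ["y1", "y2"]

def last_positive(values):
    """sup{t; values[t] > 0}; returns 0 for an empty set (a convention of this sketch)."""
    return max((t for t, v in enumerate(values) if v > 0), default=0)

def t_min(x):
    """Last time at which a_{x,y}(t) is positive for every y."""
    return last_positive([min(a[x][y][t] for y in Y) for t in range(HORIZON)])

def t_max(y):
    """Last time at which a_{x,y}(t) is positive for some x."""
    return last_positive([max(a[x][y][t] for x in X) for t in range(HORIZON)])

lhs = max(t_min(x) for x in X)
rhs = min(t_max(y) for y in Y)
print(lhs, rhs, lhs <= rhs)
```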


8.3 Some Elements of Game Theory 


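Recall that, for every column vector $X$, $X^T$ denotes the transpose of $X$. For any integer
$k$ we let $\mathcal{T}_k$ denote the $k$-dimensional fundamental simplex and $\mathcal{T}'_k$ denote the set of
extreme points of this simplex:


\[
\mathcal{T}_k = \Big\{ (\lambda_1, \ldots, \lambda_k) \in [0,1]^k ; \; \sum_{i=1}^{k} \lambda_i = 1 \Big\},
\]


\[
\mathcal{T}'_k = \{ (1,0,\ldots,0), (0,1,\ldots,0), \ldots, (0,0,\ldots,1) \}.
\]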


The following theorem is the celebrated result of Von Neumann which initiated the 
field of game theory. 
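Theorem 8.3.1 (Von Neumann) For any $k \times l$ real matrix $M = (M_{ij})_{i,j}$, the
equality
\[
\max_{X \in \mathcal{T}_k} \min_{Y \in \mathcal{T}_l} X^T M Y = \min_{Y \in \mathcal{T}_l} \max_{X \in \mathcal{T}_k} X^T M Y
\]


is valid. Furthermore, this common value is equal to:


\[
\max_{X \in \mathcal{T}_k} \min_{Y \in \mathcal{T}'_l} X^T M Y \quad \text{and} \quad \min_{Y \in \mathcal{T}_l} \max_{X \in \mathcal{T}'_k} X^T M Y.
\]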


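Note that $(X,Y) \mapsto X^T M Y$ is continuous on the compact $\mathcal{T}_k \times \mathcal{T}_l$. Hence for
every $X$ we can find an element $Y_X$ in $\mathcal{T}_l$ such that $\inf_{Y \in \mathcal{T}_l} X^T M Y = X^T M Y_X$.
Furthermore we easily check that we can select $Y_X$ in such a way that $X \mapsto Y_X$
defines a continuous function on $\mathcal{T}_k$. This immediately implies that


\[
\begin{aligned}
\sup_{X \in \mathcal{T}_k} \inf_{Y \in \mathcal{T}_l} X^T M Y &= \sup_{X \in \mathcal{T}_k} X^T M Y_X \\
  &= \max_{X \in \mathcal{T}_k} X^T M Y_X \\
  &= \max_{X \in \mathcal{T}_k} \min_{Y \in \mathcal{T}_l} X^T M Y.
\end{aligned}
\]


We similarly easily establish that $\inf_{Y \in \mathcal{T}_l} \sup_{X \in \mathcal{T}_k} X^T M Y = \min_{Y \in \mathcal{T}_l} \max_{X \in \mathcal{T}_k} X^T M Y$.
Note also that the inequality $\sup_X \inf_Y X^T M Y \le \inf_Y \sup_X X^T M Y$ is a direct
consequence of Lemma 8.2.1. A proof of the converse can easily be derived with the
use of the duality result in Linear Programming.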
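
A small numerical illustration of Theorem 8.3.1 may help. The sketch below is not a general solver: it assumes the hypothetical $2 \times 2$ "matching pennies" matrix, searches mixed strategies on a grid that happens to contain the optimal point, and uses the second part of the theorem to restrict the inner optimization to pure strategies.

```python
from fractions import Fraction as Fr

M = [[1, -1],
     [-1, 1]]  # matching pennies, chosen as a hypothetical example

def payoff(X, Y):
    """X^T M Y for mixed strategies X (rows) and Y (columns)."""
    return sum(X[i] * M[i][j] * Y[j] for i in range(2) for j in range(2))

pure = [(Fr(1), Fr(0)), (Fr(0), Fr(1))]                  # extreme points of the simplex
grid = [(Fr(i, 10), 1 - Fr(i, 10)) for i in range(11)]   # sampled mixed strategies

# Pure strategies only: the order of play matters.
pure_maxmin = max(min(payoff(X, Y) for Y in pure) for X in pure)
pure_minmax = min(max(payoff(X, Y) for X in pure) for Y in pure)

# Mixed strategies for the outer player, pure strategies for the inner one.
mixed_maxmin = max(min(payoff(X, Y) for Y in pure) for X in grid)
mixed_minmax = min(max(payoff(X, Y) for X in pure) for Y in grid)

print(pure_maxmin, pure_minmax)    # a gap remains without randomization
print(mixed_maxmin, mixed_minmax)  # the gap closes with randomization
```

For a general matrix a grid search only approximates the value; an exact computation would typically go through linear programming, as the duality remark above suggests.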


The game theory interpretation of this result is as follows. Consider two players
Player(1) and Player(2) involved in a game $(G, \Pi, \mathcal{A})$ with a performance function
$f$. $G$ is the set of rules of the game describing how the game is played between the
two players, which actions (and in which order) are to be undertaken by the players
and in particular which information is traded between the two players during the
execution of a game. $\Pi$ is the set of allowable strategies of Player(1), $\mathcal{A}$ is the set of
allowable strategies of Player(2): $\Pi$ (resp. $\mathcal{A}$) is a subset of the set of strategies of
Player(1) (resp. Player(2)) compatible with the rules of the game $G$. Note that, by
definition, a strategy of either player is defined independently of the strategy chosen
by the other player. $f : \Pi \times \mathcal{A} \to \mathbb{R}$ is a performance function measuring "how well"
a given strategy $\pi$ of Player(1) does when implemented against a given strategy $A$
of Player(2). We assume that the game is a zero-sum noncooperative game, which
means that one of the two players, say Player(1), wants to choose its strategy so as
to maximize the performance $f(\pi, A)$ and that the other player, Player(2), wants
to choose its strategy so as to minimize the performance.


We consider the sequential case where one player chooses a strategy first and where
the other player then chooses its own. Hence, if Player(1) plays first, for every $\varepsilon > 0$,
the two competing players can choose their respective strategies $\pi$ and $A$ so as to
bring $f(\pi, A)$ to within $\varepsilon$ of $\sup_\pi \inf_A f(\pi, A)$. Conversely, if Player(2) plays first,
for every $\varepsilon > 0$, the two competing players can choose their respective strategies $\pi$
and $A$ so as to bring $f(\pi, A)$ to within $\varepsilon$ of $\inf_A \sup_\pi f(\pi, A)$. By Lemma 8.2.1,


\[
\sup_{\pi} \inf_{A} f(\pi, A) \le \inf_{A} \sup_{\pi} f(\pi, A), \tag{8.4}
\]


which expresses that, for either player, selecting its strategy last can only be beneficial.
Generally, the inequality is strict, i.e., there is a definite advantage in choosing
its strategy last.


An interpretation of this inequality is that, even though no explicit exchange of
information is performed between the two players when they select their respective
strategies, the player selecting its strategy last can be assumed to know the strategy
selected by the player selecting its strategy first. The reason for it is that, for
every given choice $\pi$ made by Player(1), there is a Player(2) that "always assumes"
that Player(1) chooses $\pi$ and that makes an optimized decision $A$ based on this
assumption. Such a Player(2) is by construction optimal if Player(1) does choose $\pi$.
Based on this remark we say that an optimal Player(2) knows implicitly the identity
of Player(1) in the expression $\sup_\pi \inf_A f(\pi, A)$. Symmetrically, we say that
an optimal Player(1) knows implicitly the identity of Player(2) in the expression
$\inf_A \sup_\pi f(\pi, A)$.


In this setting, the strict inequality in Inequality 8.4 expresses precisely that allocating
to one or the other player the possibility of spying on the competitor's strategy
affects the performance of the game. This interpretation of Inequality 8.4 will be
very useful in the rest of the discussion.


Consider now the case where Player(1) is provided with a set $I$ of size $k$ and
Player(2) is provided with a set $J$ of size $l$. Assume that to each pair $(i,j)$ in $I \times J$
is associated a cost $M_{ij}$ in $\mathbb{R}$. Let $((G, \Pi, \mathcal{A}), f)$ be a game with a performance function
where $\Pi = I$, $\mathcal{A} = J$, $f(i,j) = M_{ij}$ and where the rules are the trivial rules: "do
nothing". In the case where Player(1) and Player(2) both choose their strategies
optimally and where Player(1) chooses first, the performance associated to the game
is $\max_i \min_j M_{ij}$. Conversely, if Player(2) plays first, the performance associated
to the game is $\min_j \max_i M_{ij}$. As discussed above, $\max_i \min_j M_{ij} \le \min_j \max_i M_{ij}$
and, generally, the inequality is strict, i.e., there is a definite advantage in making
its choice last.


We consider now the case where the players are allowed to make random choices. 
The following theorem formalizes the fact that, in a game of cards played by two 
players, knowing the opponent’s strategy confers no advantage for winning any single 
hand, provided that the hand finishes in a bounded number of transactions. 
We abuse language and identify a probability distribution with the procedure consisting
of drawing an element at random according to this probability distribution.


Theorem 8.3.2 Consider a two-player game $(G, \Pi, \mathcal{A})$ having the property that
there exist two finite sets $I$ and $J$ such that $\Pi$ is the set of probability distributions
on $I$ and such that $\mathcal{A}$ is the set of probability distributions on $J$. Let $T$ be a function
on $I \times J$. ($T$ is often called the cost function.) For every $\pi$ in $\Pi$ and every $A$ in
$\mathcal{A}$ we let $E_{\pi,A}[T]$ denote the expected value of $T$ when Player(1) selects an element
in $I$ according to the distribution $\pi$ and when Player(2) selects an element in $J$
according to the distribution $A$. Then
\[
\max_{\pi \in \Pi} \min_{A \in \mathcal{A}} E_{\pi,A}[T] = \min_{A \in \mathcal{A}} \max_{\pi \in \Pi} E_{\pi,A}[T].
\]

The sets $I$ and $J$ are often called the sets of pure strategies. The sets $\Pi$ and $\mathcal{A}$ are
often called the sets of mixed strategies.


PROOF. A probability distribution on $I$ is represented by a $k$-tuple $(\lambda_1, \ldots, \lambda_k)$ of
non-negative numbers summing up to one. Equivalently, $\mathcal{T}_k$ is the set of probability
distributions on $I$ and similarly $\mathcal{T}_l$ is the set of probability distributions on $J$. By
assumption $\Pi$ can be identified with $\mathcal{T}_k$ and a strategy $\pi$ in $\Pi$ can be represented by an
element $X$ in $\mathcal{T}_k$. Similarly, $\mathcal{A}$ can be identified with $\mathcal{T}_l$ and a strategy $A$ in $\mathcal{A}$ can be
represented by an element $Y$ in $\mathcal{T}_l$. For every $(i,j) \in I \times J$ we write $M_{ij} = T(i,j)$ and
we let $M$ represent the associated matrix: by construction $M$ is a $k \times l$ real matrix. Using
these associations, consider the case where Player(1) chooses the strategy $X \in \mathcal{T}_k$
and where Player(2) chooses the strategy $Y \in \mathcal{T}_l$. The quantity $X^T M Y$ represents
the expected value of the cost $T$ obtained under these strategies. Theorem 8.3.2 is
therefore a direct application of Theorem 8.3.1.


A game such as the one described in Theorem 8.3.2 is often called a matrix game.
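
The identification of the expected cost with the bilinear form $X^T M Y$ can be checked on a small instance. The following minimal sketch uses hypothetical pure-strategy sets, a hypothetical cost table `T`, and hypothetical distributions `pi` and `A`; note that the matrix has as many rows as $I$ has elements and as many columns as $J$ has elements.

```python
from fractions import Fraction as Fr

I = ["i1", "i2"]                      # pure strategies of Player(1)
J = ["j1", "j2", "j3"]                # pure strategies of Player(2)
T = {("i1", "j1"): 4, ("i1", "j2"): 0, ("i1", "j3"): 2,
     ("i2", "j1"): 1, ("i2", "j2"): 3, ("i2", "j3"): 6}   # hypothetical costs

pi = {"i1": Fr(1, 3), "i2": Fr(2, 3)}                     # mixed strategy of Player(1)
A = {"j1": Fr(1, 2), "j2": Fr(1, 4), "j3": Fr(1, 4)}      # mixed strategy of Player(2)

# Expected cost under independent draws of i and j.
expected = sum(pi[i] * A[j] * T[i, j] for i in I for j in J)

# Same quantity written as X^T M Y.
M = [[T[i, j] for j in J] for i in I]
X = [pi[i] for i in I]
Y = [A[j] for j in J]
MY = [sum(M[r][c] * Y[c] for c in range(len(J))) for r in range(len(I))]
bilinear = sum(X[r] * MY[r] for r in range(len(I)))

print(expected, bilinear)
```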


Explicit/Implicit knowledge. In the course of the previous discussion we introduced
the notion of implicit knowledge. We present here an abstract summarizing
this concept.


We say that a player, say Player(2), receives explicitly some information $x$ during
the course of an execution when the rules of the game (i.e., when considering
algorithms, the $\Pi/A$-structure) specify that Player(2) be informed of $x$. We
can say figuratively that Player(2) "receives a message" carrying the information $x$.
Note that, in this situation, Player(2) receives the information $x$ independently of
the strategy that it follows.
Consider now a function of two variables $f(x, A)$, $(x, A) \in X \times \mathcal{A}$, and consider the
expression $\inf_A f(x, A)$. We can consider this situation as a game parameterized
by $x$ played by Player(2): Player(2) tries to decide $A$ so as to minimize $f(x, A)$.
Even though there might be no mechanism letting Player(2) explicitly know what
the parameter $x$ is, when considering the expression $\inf_A f(x, A)$ we can assume
that an optimal (and lucky) Player(2) selects non-deterministically an $A$ bringing
the function $f(x, A)$ arbitrarily close to $\inf_A f(x, A)$: we then say that an optimal
Player(2) knows implicitly the parameter $x$ selected. This knowledge is not a real
knowledge and is of course nothing more than a heuristic meaning that some choice
of $A$ corresponds ("by chance") to the choice that some informed Player(2) would
make. Note that, in contrast to the case of explicit knowledge, Player(2) is said to
"have the implicit knowledge" when it chooses a "good" strategy $A$.


We provide two examples of this situation. When considering the formula


\[
\sup_{\pi} \inf_{A} f(\pi, A)
\]
we will say that Player(2) knows implicitly the value $\pi$. Also when considering the
formula


\[
\inf_{A} P_A[C \mid B]
\]


we will say that Player(2) knows implicitly that the sample space is restricted to
within $B$.